
"General knowledge + industry" large models: the empowerment path of Alaya meta-knowledge

Author: Brother Tao Says Things

In November 2023, the "Analysis Report on China's AI Large Model Innovation and Patent Technology", released by the National Industrial Information Security Development Research Center and the Electronic Intellectual Property Center of the Ministry of Industry and Information Technology, showed that the total number of large model patent applications in mainland China had exceeded 40,000, and that innovation in large model-related fields is increasingly active.

Compared with large models aimed at the consumer (to C) market, the key to building industry barriers or differentiated competitive advantages lies in cultivating specific industries: fully mastering industry knowledge and going deep into the business processes of industry customers. According to the "Artificial Intelligence Large Model Experience Report 3.0" released by the China Enterprise Development Research Center of the Xinhua News Agency Research Institute, large model vendors are closely matched in technical strength, each with its own advantages in product characteristics.

Beyond the large model vendors and the models themselves at the center of the boom, the companies that help large models land in specific industries and provide the underlying software, hardware, and services deserve attention. They should not remain "unsung heroes"; on the contrary, they are the indispensable, reliable guarantee that industry users can actually put large models to work.

What matters most is putting large models to work

For the many infrastructure and service providers committed to making large models broadly accessible, large models are an important part of AI infrastructure and services, and a fulcrum for bringing AI into industry applications. By developing and optimizing large models, they offer a path and a platform for enterprise users who cannot, or do not need to, develop large models themselves, so that users in every industry can enjoy the dividends of large models today.

DataCanvas Alaya, the company's self-developed multimodal large model series, is an important part of its AI foundation software, AIFS (AI Foundation Software). It provides enterprise users with a foundation model on which they can train and fine-tune their own large models.

The greater challenge in large model training is how to accelerate training while reducing computing power consumption, and how to better adapt the model after training, so that large models can "fly into the homes of ordinary people" and small and medium-sized enterprises can benefit.

This is also the original intention behind DataCanvas's dedication to the Alaya meta-knowledge series of large models. "Alaya" comes from a Buddhist term referring to an innate capacity for knowing. The Alaya meta-knowledge series integrates and absorbs a variety of capabilities, aiming to use the general abilities of large models to perceive, to the greatest extent possible, the knowledge humans have accumulated across the ages, the laws by which the outside world operates, scientific principles, and so on, and on that basis to better support all kinds of human business with human-like capabilities.

On the one hand, enterprise users can carry out secondary training or fine-tuning on top of the Alaya meta-knowledge model to meet their business needs; on the other hand, DataCanvas operates its own intelligent computing center in which the Alaya meta-knowledge series is deployed, so small and medium-sized enterprises can directly call the large model services DataCanvas provides.
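For a concrete picture of what building on such a foundation model typically looks like, here is a minimal sketch using the open-source Hugging Face transformers library. The repository id "DataCanvas/Alaya-7B-Base", the prompt, and the generation settings are illustrative assumptions, not DataCanvas's documented usage.

```python
# Minimal sketch: loading a 7B foundation model for local inference.
# The repository id below is an assumption; substitute the id actually
# published by DataCanvas.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DataCanvas/Alaya-7B-Base"  # assumed id, adjust to the real release

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick fp16/bf16 automatically where available
    device_map="auto",    # spread layers across available GPUs (needs accelerate)
    trust_remote_code=True,
)

prompt = "Summarize the key risks disclosed in this quarterly report:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```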

Driven by the Alaya meta-knowledge series, DataCanvas is actively looking for scenarios where large models can land in enterprise business. In addition to the general-purpose model, the company has launched a large model for the financial industry and will release more industry-specific models in the future, empowering users and promoting the popularization of large model applications.

What sets the Alaya meta-knowledge model apart

As the "100-model war" gradually becomes a climate, the industry, academia and the media have released their own large-scale model evaluation lists. However, due to the lack of recognized and effective evaluation standards and methods, as well as the different focuses of different rankings, the results of different rankings are very different, or even vastly different. Openness, fairness and impartiality evaluation can indeed provide useful reference and reference for industry users to choose large models, but in addition to some key technical indicators, whether large models can effectively solve the business pain points of industry users, not only easy to use, but also easy to use, should be an important criterion for selecting large models.

The Alaya meta-knowledge model is a "general knowledge + industry" white-box model developed by DataCanvas. As one of the core capabilities of DataCanvas AIFS, it adheres to an open, friendly open-source philosophy and provides a series of pre-trained large models with different configurations and parameter scales. With industry-leading capabilities and technology, it gives users greater freedom in AI innovation and accelerates the implementation of large models across multiple business scenarios.


1. The "white box" large model provides users with more freedom.

Many large models on the market today are "black boxes": although their algorithms and architectures are open-sourced to a certain extent and users are allowed to train on them, many restrictions remain. For example, users may be permitted to use a large model but not to fine-tune it, or may be barred from building their own commercial applications on it. The Alaya meta-knowledge model is released under the Apache 2.0 license, which allows industry users to freely train and fine-tune their own large models on top of it.
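As an illustration of the freedom those Apache 2.0 terms describe, here is a hedged LoRA fine-tuning sketch using the open-source transformers and peft libraries. The repository id, dataset path, target module names, and hyperparameters are all assumptions chosen for illustration, not DataCanvas's prescribed recipe.

```python
# Minimal LoRA fine-tuning sketch on top of an open foundation model.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "DataCanvas/Alaya-7B-Base"              # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token      # needed for padding/collation
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Attach small low-rank adapters instead of updating all 7B parameters.
# Target module names depend on the architecture; these are common defaults.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Industry corpus in plain-text form (placeholder path).
data = load_dataset("text", data_files={"train": "industry_corpus.txt"})
tokenized = data.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alaya-industry-lora",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("alaya-industry-lora")       # only adapter weights are saved
```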

2. "Multimodality" is a necessary premise and a means of innovation.

A multimodal large model is a machine learning model that can process information from multiple modalities, such as images, speech, and text. Today, multimodality has become standard equipment for large models. The Alaya meta-knowledge model supports not only text and images but also time series data and structured data.

For example, equipment repair manuals often contain descriptions such as: "As shown in the figure, the fault point is the location marked by the red circle..." A large model trained on text alone can hardly grasp the correct meaning of such a sentence. Combining text with images in this way is a typical multimodal application.
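To make the repair-manual example concrete, here is a small visual question answering call using an openly available checkpoint as a stand-in; this is not the Alaya model itself, and the image path and question are invented for illustration.

```python
# Sketch of the repair-manual scenario as a visual question answering call.
from transformers import pipeline

vqa = pipeline("visual-question-answering",
               model="dandelin/vilt-b32-finetuned-vqa")  # open stand-in model

result = vqa(image="manual_page_07.png",                 # placeholder image path
             question="Which component is inside the red circle?")
print(result[0]["answer"], result[0]["score"])
```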

There is no doubt that the Alaya meta-knowledge model is deeply multimodal. DataCanvas wants to go further and treat "data" itself as one of the important modalities for in-depth research. To apply large models, data is a threshold that must be crossed. DataCanvas has many successful cases in areas such as natural language understanding, text-to-image generation, and code generation, which are already relatively mature applications of large models. However, only a handful of companies study data as a modality and train large models on it, and DataCanvas is at the forefront. For example, DataPilot, a new paradigm for data processing and a new generation of data architecture tool built on large models, makes full use of the general text understanding and generation capabilities of the Alaya meta-knowledge model, fine-tuned and optimized for the data domain, to help users achieve intelligence and automation across the whole modeling lifecycle.

In training large models, data processing, transformation, classification, labeling, and storage are time-consuming and laborious. In the past, if no ready-made dataset existed, raw stored data had to be located, converted, and cleaned before it could be imported into a data warehouse and presented. Effectively shortening this long data-processing chain is of great value to large model applications: it reduces the manpower consumed in training and application, and it also improves application results. DataPilot can greatly lower the technical threshold of data integration, governance, modeling, computing, querying, analysis, and machine learning modeling, reduce the cost of data-driven business development, and accelerate enterprise digital innovation.
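A toy sketch of that shortened chain, in the spirit of DataPilot rather than using its actual API: a natural-language question is turned into SQL by a model call, represented here by a hypothetical placeholder function, and executed directly against the data.

```python
# Illustration of the shortened data chain: question -> SQL -> result,
# instead of a hand-built ETL pipeline. generate_sql() is a hypothetical
# stand-in for a call to a fine-tuned model; everything else is stdlib.
import sqlite3

def generate_sql(question: str, schema: str) -> str:
    """Hypothetical LLM call: prompt = schema + question, completion = SQL."""
    # A canned answer keeps the sketch self-contained and runnable.
    return "SELECT region, SUM(amount) AS total FROM orders GROUP BY region;"

schema = "CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);"
conn = sqlite3.connect(":memory:")
conn.execute(schema)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "North", 120.0), (2, "South", 80.5), (3, "North", 45.0)])

sql = generate_sql("What is the total order amount per region?", schema)
for row in conn.execute(sql):
    print(row)   # e.g. ('North', 165.0) and ('South', 80.5)
```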


3. A new model training mechanism reduces costs and increases efficiency.

As everyone knows, training a large model is very expensive. A large model cannot accurately understand meaning from a single sentence; it must take in and process far more linguistic context, so the computing power consumed is enormous, and models must accommodate more and longer content. The Alaya meta-knowledge model adopts an improved attention mechanism, a longer context window, composable fine-tuning, and a new masking mechanism, which ensure accuracy of understanding and improve processing speed while effectively reducing the computing power consumed by training.
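The article does not disclose the exact design, so the following is only a generic illustration of one common way to cut long-context cost: a sliding-window attention mask in PyTorch. It should not be read as DataCanvas's actual mechanism.

```python
# Generic illustration (not DataCanvas's design): a sliding-window attention
# mask lets each token attend only to the last `window` positions, reducing
# the effective cost of very long contexts.
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask: True where attention is allowed (causal + local window)."""
    i = torch.arange(seq_len).unsqueeze(1)   # query positions
    j = torch.arange(seq_len).unsqueeze(0)   # key positions
    return (j <= i) & ((i - j) < window)

q = k = v = torch.randn(1, 8, 2048, 64)      # (batch, heads, seq, head_dim)
mask = sliding_window_mask(2048, window=256)

# scaled_dot_product_attention accepts a boolean mask where True = attend.
out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(out.shape)                              # torch.Size([1, 8, 2048, 64])
```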

DataCanvas has a R&D team, which is responsible for the R&D and innovation of the training mechanism. The Alaya meta-cognitive large model adopts a new Attention mechanism, which can not only reduce the consumption of computing power, but also achieve effective data alignment for multimodality before and after training. This is a unique technology of DataCanvas, which can well accommodate such a training method as multimodality during training.

4. A matrix of models better meets "general knowledge + industry" needs.

Alaya meta-knowledge is not a single large model but a series of models, with parameter counts from small to large and coverage from general knowledge to vertical industries, so it can better meet users' diverse needs. The Alaya-7B Foundation Model and Alaya-7B Chat Model, together with the LMS model running tool and the LMPM prompt word manager in the LLMOps large model toolchain, can effectively promote the practical application of large models across industry scenarios.

(Figures: Alaya-7B, the LMS Model Runner, and the LMPM Prompt Word Manager.)

For specific application scenarios, DataCanvas has built the TableAgent data analysis agent on the Alaya meta-knowledge model and the LLMOps large model toolchain. After fully understanding the user's intent, it can independently apply advanced modeling techniques such as statistics, machine learning, and causal inference to mine value from data, and then deliver analytical views and deep insights that guide action, so that everyone can be a data analyst.

TableAgent public beta: https://tableagent.DataCanvas.com
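TableAgent itself is accessed through the hosted beta above; the sketch below only illustrates, on synthetic data, the kind of routine such an agent might generate when asked "which factor drives churn?", combining descriptive statistics with a quick model. It is not TableAgent's API or output.

```python
# Illustrative analysis routine on synthetic data: descriptive statistics plus
# a quick logistic model whose coefficients are read back as an insight.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 60, 500),
    "monthly_spend": rng.normal(50, 15, 500).round(2),
})
# Synthetic target: short-tenure customers churn more often.
df["churned"] = (rng.random(500) <
                 1 / (1 + np.exp(0.08 * df["tenure_months"] - 2))).astype(int)

print(df.describe())                          # descriptive view of the table

model = LogisticRegression().fit(df[["tenure_months", "monthly_spend"]],
                                 df["churned"])
for name, coef in zip(["tenure_months", "monthly_spend"], model.coef_[0]):
    # A negative tenure coefficient reads as: longer tenure, lower churn risk.
    print(f"{name}: coefficient {coef:+.3f}")
```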


In finance, manufacturing, and new energy, the Alaya meta-knowledge model has already landed in many scenarios, realizing integrated innovation across the toolchain, the large model, and industry applications. For example, combined with DingoDB, the multi-model vector database in its toolchain, the Alaya meta-knowledge model provides enterprises with an enterprise knowledge steward solution, giving them their own intelligent knowledge steward right away.
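A minimal sketch of how such a knowledge steward can be wired together, assuming a retrieval-augmented pattern: a small open embedding model produces the vectors, an in-memory numpy index stands in for the DingoDB vector database, and the final generation step is left as a hypothetical placeholder call.

```python
# Minimal retrieval-augmented sketch of an "enterprise knowledge steward".
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Expense reports must be submitted within 30 days of travel.",
    "The VPN client is rolled out automatically by the IT department.",
    "Annual leave carries over for at most 5 unused days.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # small open embedding model
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2):
    """Return the k most similar documents by cosine similarity."""
    q_vec = encoder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [docs[i] for i in np.argsort(-scores)[:k]]

question = "How long do I have to hand in my expense report?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# answer = answer_with_llm(prompt)   # hypothetical call to the hosted model
print(prompt)
```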

DataCanvas's positioning of empowering B2B (to B) applications through software infrastructure also determined its choice to build a white-box large model, so that users in every industry can train and optimize their own large models in their own professional fields using the meta-knowledge models and the AIFS foundation software. Starting from general knowledge and applying the accumulated experience to different industries: this is the original intention behind DataCanvas's "general knowledge + industry" large models.

The Alaya meta-knowledge model is included in AIFS, which provides the data preparation, training, and fine-tuning methods the model requires. Users only need to tell AIFS where the data is, and it automatically handles infrastructure acceleration, data processing, data labeling, and so on. Users no longer have to pick different tools and string them together themselves, as before; AIFS completes the required orchestration and processing automatically, greatly reducing the burden on users. This customizability of the Alaya meta-knowledge model and AIFS is also an important reason they are favored by users.

"Large model + small model" is both hands

Today, any mention of artificial intelligence inevitably involves large models. But that does not mean abandoning small models and moving everything to large ones at once. DataCanvas believes the future ecology will be "large model + small model", with models of different characteristics adapted to different scenarios.

It is undeniable that large models have an innate advantage in general-purpose tasks, logical reasoning, and human-like natural language processing. At present, however, their application scenarios are not yet rich enough and are still being explored. In applications that require precise calculation, or in certain specialized scenarios, small models remain indispensable. Vertical distillation can compress a large model into a small one, which better meets the needs of scenarios such as scientific computing and attribution analysis. In addition, many AI engines on the market support business by invoking and orchestrating small models. In terms of product line layout, DataCanvas is making internal adjustments in the hope that more of its businesses and products can be supported by large models in the future, and it will continue to expand firmly along the "large model + small model" path.
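For readers unfamiliar with distillation, here is a generic teacher-student sketch in PyTorch showing the basic mechanism of matching softened output distributions. It illustrates the idea only and is not DataCanvas's specific "vertical distillation" pipeline; the toy networks, temperature, and training loop are arbitrary.

```python
# Generic distillation step: a small student is trained to match the softened
# output distribution of a larger teacher, so the compact model inherits part
# of the teacher's behaviour.
import torch
import torch.nn.functional as F

teacher = torch.nn.Sequential(torch.nn.Linear(32, 256), torch.nn.ReLU(),
                              torch.nn.Linear(256, 10))   # stand-in "large" model
student = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.ReLU(),
                              torch.nn.Linear(32, 10))    # compact model

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                                                   # softening temperature

for step in range(100):
    x = torch.randn(64, 32)                               # unlabeled batch
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # KL divergence between softened distributions, scaled by T^2 as is standard.
    loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                    F.softmax(t_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```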

According to the "Chinese Artificial Intelligence Large Model Map Research Report" released by the Institute of Scientific and Technical Information of China, more than half of the large models released in China have been open sourced. The purpose of DataCanvas is to enable enterprises of all sizes, especially small and medium-sized enterprises and even individuals, to complete the training, fine-tuning and commercialization of large models on top of the large models and related infrastructure of DataCanvas. DataCanvas is committed to the construction of an artificial intelligence open source ecosystem, hoping to occupy a place in it and take deep roots.
