Real-world scenarios refine the large model: Quark uses AI to "speed up" again

It has been a year since large-model technology entered public view, and the productivity gains it brings are plain to see. However, a truly phenomenal application has yet to emerge, which leaves a great opportunity.

Alibaba, Tencent, Baidu, ByteDance, and other companies have announced that they will rebuild their existing applications with large models, and Pinduoduo was recently reported to have joined the ranks of companies developing their own large models. Now another exciting player has joined the fray.

On November 14, Alibaba's intelligent information business group released the self-developed Quark large model, with parameters on the 100-billion scale, which topped the two authoritative evaluation lists C-Eval and CMMLU. Riding the trend of AI rebuilding applications, the Quark large model will comprehensively upgrade Quark's product matrix and services.

Jiang Guanguan, Quark's head of technology, said: "After the release of GPT last year, we accelerated our work on large models, and the model's capabilities reached a relatively high level early on. We are releasing it only now because we hoped to have concrete applications and experiences ready on the product side before the release."

Launched in 2018, the Quark App was designed to be a smart assistant for young people. Today it integrates a variety of functions such as search, scanning, network disk, and documents. Powered by its self-developed large model, Quark intends to take the lead in education and health and become a leading intelligent information product.

A "top student" large model that comprehensively surpasses GPT-3.5 in higher education and vocational examinations

The large model Quark released this time is a general-purpose model with 100 billion parameters. Jiang Guanguan said that the overall level of the Quark large model is better than GPT-3.5, and that it ranks at the top of the domestic industry in multilingual translation, code writing, security compliance, and content creation.

Compared with the large models released by other companies, the Quark large model is stronger on knowledge accuracy.

At present, no large model at home or abroad can claim to have completely eliminated hallucinations, but techniques such as supervised fine-tuning (SFT) can reduce a model's error rate.

Quark appears very confident in its model's knowledge accuracy. At a time when many vendors are silent about hallucination rates, Quark took the initiative to give a number: 5%.

In critically important areas such as health, Quark has been able to reduce the hallucination rate of Q&A content to 5%.

The Quark large model leads in its ability to reduce hallucinations

To achieve this, the Quark team put a great deal of effort into the hallucination problem.

According to Jiang Guanguan, first of all, in the pre-training process of the model, Quark spent a lot of time and effort to check and align the accuracy of the data.

Second, there is human alignment. The Quark large model sets very high requirements for the accuracy of manually labeled SFT samples. Quark takes a meticulous approach, with preliminary reviews, spot checks, and re-audits.

On both points, Quark's search engine capabilities have played a strong supporting role in building the Quark large model. "We used to do general search and accumulated a lot of industry data. At the same time, because we have done search before, we had to have a system for understanding, aligning, and verifying web content, and that carries over well to the alignment of large models," Jiang Guanguan said.

In addition, as a 100-billion-parameter model, the scale of its parameters and the Quark team's improvements to the model itself also reduce hallucinations.

To demonstrate its knowledge accuracy, the Quark team not only tested the model on the common large-model evaluation lists but also had it come into the real world and sit human exam papers like a candidate. On the CMMLU list, it ranked first overall with an average score of 77.08 and took first place in social sciences and two other categories. On the C-Eval list, it topped the industry with an average score of 89 and led in social sciences, humanities, and three other categories.

The Quark large model team collected 45 exams from 2020 to 2023 for the model, including questions from school exams at the junior, secondary, and high school levels and from postgraduate exams, as well as various professional exams such as the Certified Public Accountant exam and the National Judicial Examination. The Quark large model's overall performance was outstanding: it surpassed GPT-3.5 as a whole and partially surpassed GPT-4. In 11 subjects its accuracy exceeded 80%, enough to be called a "top student".

Quark large model exam results

As a large model trained on a high-quality Chinese corpus, the Quark large model has an accuracy rate 70% higher than GPT's in exams with Chinese characteristics, such as the Chinese-language paper of the college entrance examination, the joint examination for education, and the national civil service examination. At the same time, its English remains extremely strong: Jiang Guanguan said the Quark model scored almost full marks on the English test paper.

AI upgrades "search and storage", and the fields of education and health take the lead

Among many fields, the Quark large model concentrates its efforts on education and health.

It is reported that more than 50% of Quark users are young people under the age of 25. Before the launch of the Quark large model, the Quark App had already built up many education-related applications, such as Quark Learning, where users can see learning content such as local test papers, test preparation tips, and typical questions after selecting their grade.

These all involve obtaining data from across the education industry, including various materials, lesson plans, question banks, and knowledge points. This high-quality education data in turn helps Quark with the training stage of the large model, which is one of the reasons the Quark large model can be a top student.

Compared with other large models, the Quark large model has, from the training stage onward, paid more attention to how a problem is solved step by step and to which knowledge points a question tests.

In the live demonstration, Jiang Guanguan asked the large model, "What is the difference between in and on in English prepositions?" The model first gave a paragraph on how the two words differ by definition: the two prepositions have different meanings, different usages, and different emphases. When questioned further, the Quark model went on to give two examples of the words in an English context.

Quark technical head Jiang Guanguan

Jiang Guanguan said, "AI can basically teach my daughter English. In the future, such capabilities will be embedded in Quark's educational applications."

"At present, there are two main problems with large models in education. On the one hand, the reasoning and sorting abilities of large models are not that good; on the other hand, multimodal image capabilities have not been put to good use in education. Even OpenAI's model can't do geometry problems at the moment," Jiang Guanguan said. "Based on users' needs, we will first produce a lot of AIGC content and upgrade the existing document reading comprehension and mistake collection features. Building on this, the model that teaches 'in' and 'on' will in the future be a bit like a junior tutor, which is what we are working towards."

Health is the other vertical the Quark large model is currently working on. Before the release of the large model, Quark did a lot of data construction and knowledge construction in the health industry.

Because Quark's health data has gone through three reviews and three proofreadings by doctors, the Quark large model, which emphasizes knowledge accuracy, can reach 95% accuracy on health data, which makes it more usable.

"The critical error rate is actually lower. The current 5% error rate includes non-critical errors, such as confusing similar symptoms," Jiang Guanguan said.

In the health industry, Quark will provide health information query services, such as popular science Q&A. As with its educational applications, Quark will pay more attention to how the large model arrives at its conclusions. After receiving a recommendation from the Quark model, users will also be able to click to see which health guide or textbook the recommendation comes from.

In the future, Quark also hopes to use the large model to build a more user-friendly way of delivering services. In health scenarios, for example, after the user describes their symptoms, the large model could go on to ask whether the user has other commonly related symptoms.

Beyond these industry-specific capabilities, the three core functions of the Quark App (search, use, and storage) have also begun to be upgraded with large models. Take storage as an example: the AI natural-language search function currently available on the Quark network disk can quickly find photos, documents, and other cloud materials from nothing more than key information such as fuzzy words and adjectives, and at its core this comes from the ability of large models.

In the future, the Quark large model will be further applied to scenarios such as search, intelligent tools, and asset management assistants, providing more comprehensive services for young people's work, study, and life.

A search engine team's natural advantage in building a large model

The Quark large model topped the CMMLU and C-Eval lists as soon as it was announced, and it has outstanding advantages in reducing hallucinations and in the fields of health and education. All of this is inseparable from the Quark App's years of search experience.

"When we were developing the large model, we were nervous at first. But we soon became convinced that the Quark model would not be too bad in China," Jiang Guanguan said. The core reason is that Quark is a team from the search field: "We have a natural advantage in making large models."

This has been borne out repeatedly in the field of large models: Google and Microsoft, the leading foreign developers of models and applications, both have experience building search engines.

Microsoft launches New Bing, a search engine based on a large model

Jiang Guanguan summarized several ways in which search engine experience helps the team develop large models:

The first is the advantage of data. Its experience as a search engine has allowed Quark to accumulate very comprehensive, high-quality data. Being a general-purpose search engine in itself requires knowledge and data covering a wide range of industries, and even knowledge in English and other languages.

Beyond that, its experience as a search engine has also allowed the team to build up an evaluation system for the quality of web content. "A search engine by itself means a huge amount of web data. We have selected hundreds of millions of particularly high-quality pages out of hundreds of billions of pages, and this screening is particularly complicated. For a company that does not make a search engine, the cost of completing this task is very high," Jiang Guanguan said.

The second is the advantage of talent. General search, covering web search, image search, video search, document search, and more, inherently requires a variety of multimodal technical capabilities, and people with these skills can be transferred to the large-model team. It is reported that, in order to follow a full-stack self-developed technical route, Quark has built an independent product and R&D team of hundreds of people.

The third is the advantage of computing power optimization. One of the major problems large models face is that online inference is too expensive. A team that has built search engines also has better solutions for optimizing compute-heavy requests. "Quark was already able to serve hundreds of millions of online requests," Jiang Guanguan said.

In addition to the experience of making search engines, Quark has some unique advantages in the development of large models.

For example, every large-model team has to face the scarcity of high-quality SFT data and the difficulty of alignment. Because Quark has long been deeply involved in education and health, it has been able to obtain high-quality data that many other large-model teams do not have.

"Not only do we have good data accumulation in these industries, but we also have people on our team who previously worked as doctors or teachers, and whose main job is to produce professional medical knowledge. Once we started working on large models, we quickly built up a team of professionals to produce the SFT samples and the body of knowledge that large models need. In this regard, we are at the forefront in China," Jiang Guanguan said.

At a time when many large models aim to provide general services, the Quark model was tailored for the Quark App from the very beginning, using Quark's unique data advantages to build Quark into a true intelligent assistant. Although the model has parameters on the 100-billion scale, the team pays more attention to applying the model in the product, and the steady Quark model may become a distinctive force among domestic large models.
