Zhihu's large model "Zhihaitu AI" goes live! The product enters internal testing on announcement, and "Hot List Summary" is the first feature to reach its 400 million users

Yang Jing, reporting from Aofei Temple

Qubits | Official account QbitAI

Another large model product from a Chinese company has launched.

This time it is none other than Zhihu, the largest Chinese-language Q&A community, with 400 million users.

And the official announcement doubles as the start of internal testing:

Not only does Zhihu now have its first large language model, "Zhihaitu AI", but the first product built on it will be applied to the hot list.

Logical, yet unexpected.

On the one hand, Zhihu has natural advantages for large models in both scenarios and applications, and most importantly it holds a large, high-quality pool of Chinese data. New Bing treats it as one of its Chinese data sources, and Zhihu's share price soared nearly 50%.

This kind of advantage is rare in China, so this product release has been a long time coming.

On the other hand, in the minds of many practitioners, especially researchers, Zhihu is a knowledge-sharing Q&A platform where people come to watch and witness every technological revolution as it breaks out.

ChatGPT-related topics, for example, have surpassed the popularity of the AlphaGo discussions of years past, with 400 million views and nearly 240,000 discussions.

The AI technology and strategy behind Zhihu itself, however, are unknown to most people.

Now Zhihu has taken the initiative to share them.

With this product release, Zhihu's strategy for large language models has surfaced for the first time.

Zhihu's large model product: announced and in internal testing at once

At the press conference, Zhihu also showed the latest demo of the "Hot List Summary" product, so that those waiting for the internal test can take a first look.

In the demo, the "Kanshan" ("watching the mountain") assistant appears below each question on the hot list.

It extracts the key points from high-quality answers to that question, and after AI algorithms sort, aggregate, and polish them, it presents an outline of the answers to the user.

In this way, users get the key information while browsing trending questions, which makes reading far more efficient.

The large language model behind it, CPM-Bee, comes from Face Wall Intelligence, a large-model startup with Tsinghua roots that has attracted considerable market attention.

According to Li Dahai, co-founder and CTO of Zhihu, CPM-Bee is the best-performing Chinese language model they have seen to date.

Zeng Guoyang, co-founder and CEO of Face Wall Intelligence, also shared results from the internal test:

In the content-aggregation scenario, 28 of the 41 questions came out even, which is roughly on par with GPT-4.

Face Wall Intelligence is one of the earliest companies in China to research and explore this area. Its founding team comes from the Laboratory of Natural Language Processing and Social Humanistic Computing (THUNLP) in Tsinghua's Department of Computer Science, with professors Liu Zhiyuan, Sun Maosong and Liu Yang as co-founders and advisors. It therefore has rich experience in large model research, development, and deployment.

In turning research into practice, the team was the first to propose the knowledge-guided pre-trained model ERNIE. It focuses on hot topics such as model pre-training, improved learning, and parameter-efficient fine-tuning, and has published dozens of papers at top international conferences.

The team has also developed a number of open source large models, such as CPM-1, China's first Chinese-language large model; CPM-2, an efficient and easy-to-use large model; and CPM-3, a controllable, continual large model...

It has also developed dedicated proprietary large models for vertical fields such as law and biomedicine. Shortly after its founding, it reached cooperation agreements with leading customers in the legal, automotive, home appliance, media and other industries, and completed a seed round of nearly 10 million.

Just recently, Face Wall Intelligence received angel-round financing led by Zhihu, with Zhipu AI as co-investor. According to the two parties, the investment aims to combine their respective strengths to create value together and to jointly explore upper-layer applications of large language models.

Seen this way, Zhihu's large model strategy has also surfaced: invest in large model companies and build large model applications together with them.

Zhihu revealed that its relationship with Face Wall Intelligence is one of deep integration, the kind where the teams see each other every day.

Next, building on CPM-Bee and drawing on more feedback and iteration, the new model will gain stronger logical reasoning and faster training and inference, and will gradually be applied to more scenarios,

such as content creation, discussion forums, and information retrieval.

This path is a familiar one: it is what Microsoft did with OpenAI. Microsoft's product matrix fits ChatGPT's deployment scenarios well, and its use in products feeds back into the iteration of the large model. It is this deep integration of technology and application that has shaken up search engines, productivity tools, and everyday work and life, letting businesses and individuals enjoy the potential and possibilities that AIGC brings.

The question that follows is:

Why take such a path?

At present, "hot" no longer does justice to the development of large models in China. This opportunity, considered ten times greater than any previous change, is one that no business or institution wants to miss, as the new developments of recent weeks make clear.

It is undeniable that Zhihu, in moving into large models at this moment, has chosen the path that suits it best:

In the words of Zhihu CEO Zhou Yuan, to be a developer of new productivity in the AI era and a creator of new scenarios.

To understand why, we have to look at how large models are developing in China.

The first "China AIGC Industry Panorama Report" shows that the development of domestic large models can be roughly divided into three layers: the infrastructure layer, the model layer, and the application layer.

Of these, the model layer has become the key bottleneck, and to a certain extent it limits the development of the layers below and above it (the infrastructure layer and the application layer).

Whether the model layer develops well comes down mainly to two things, computing power and data: computing power is the hardware foundation that supports the training of large language models, and data is the key factor that directly determines how capable a model is and even the quality of what it generates.

Chinese data is a particular challenge. On the one hand, Chinese is inherently more complex than English, which makes it technically harder to handle. On the other hand, English datasets abroad are richer and of higher quality, while the Chinese corpus available domestically is incomplete; companies that need it have to clean it themselves, which costs manpower and money.

And this is exactly where the unique advantage that sets Zhihu apart from other platforms comes in.

We all know that the effectiveness of a model depends on both the quantity and the quality of its data. Zhihu appears able to deliver both.

In terms of quantity, the financial report for the third quarter of 2022 shows that the Zhihu community had accumulated more than 579 million pieces of content. According to the 2022 annual financial report, the cumulative number of questions and answers had reached 506 million, covering more than 1,000 vertical fields.

This is even more evident on specialized topics.

Zhang Ning, Zhihu's vice president of strategy and head of community business, shared a set of key figures:

The total number of users on the platform who are engaged in scientific research, whether studying or working, is as high as 5.44 million. In the fields of scientific research and the Internet alone, more than 20,000 image-and-text posts are produced per day on average.

The number of answers, articles and videos in mathematics, physics, astronomy, artificial intelligence and other fields has exceeded 1 million.

In addition to quantity, the quality of data is particularly critical.

When ChatGPT was first released, it often gave outrageously wrong answers. "Talking nonsense with a straight face" was the first impression ChatGPT left on everyone.

This is in fact related to the quality of the training data: the dataset contains a great deal of content of uneven quality.

On Zhihu, discussion among many professionals combined with the filtering of the Q&A mechanism makes for high-quality content data, and some Zhihu content has even been published directly as books.

When New Bing was released a while ago, many netizens found that some of its answers cited Zhihu as a source.

Zhou Yuan breaks the elements of productivity in the AI era into three layers: application scenarios, proprietary data, and base models. A discussion forum built on Q&A is a natural application scenario. The content, relationships, and knowledge graphs it constantly generates are unique, proprietary data.

The base model layer represented by GPT is developing rapidly, and combining it with Zhihu's application scenarios and proprietary data can speed up the practical application of large models. At the same time, Zhihu's professional scenarios can feed back into the iteration of large model technology.

Li Dahai also revealed that Zhihu is cooperating with various types of companies, using its own unique advantages to promote the development of domestic large models.

Beyond these practical considerations, this is also a natural choice that returns to the essence of Zhihu.

At the Zhihu Discovery Conference, Zhou Yuan once again spoke of the "sense of gain" that the Zhihu community has always held as its content value:

Let everyone better share knowledge, experience and insights, and find their own answers.

He believes that AI will ultimately serve and empower people, and is an extension of human capabilities.

Therefore, in Zhihu's specific scenario, human-machine co-creation can help creators make better use of their creativity and improve the efficiency and quality of content creation, so that more users can be helped and can broaden their horizons.

In the wave of large models, many application scenarios have been proposed. Zhihu has also entered the game, as a creator of new scenarios, to explore more value.

Looking back at every technological change of the past, millions of practitioners in China have learned and discussed, responded and debated here through Q&A, topics, round tables, ideas, columns, live broadcasts, and more.

To some extent, then, Zhihu has served as a key medium and played a role that cannot be ignored in the development of China's cutting-edge science and technology.

This has been especially clear in the current global ChatGPT storm, with 400 million views and more than 239,000 discussions on related topics.

Andrew Ng posts here every week, calling on everyone to look at this wave rationally; Yuan Jinhui, founder of OneFlow, the company acquired by Wang Huiwen that now finds itself at the center of the storm, comes to Zhihu looking for answers...

Many ChatGPT derivative products were first released here: ChatExcel, launched by a Peking University team; ChatRWKV, the first publicly available open source project benchmarked against ChatGPT; the first domestic ChatGPT detector... The developers behind them also responded in person and answered netizens' questions.

Researchers, entrepreneurs and practitioners gather and connect here, break through barriers of time and space, and explore cutting-edge trends as they emerge, which in turn promotes the development of China's cutting-edge science and technology.

From now on, though, Zhihu will draw on its accumulated advantages to contribute to the development of China's large models in a more visible way.
