iFLYTEK's Liu Cong: The large model is a system that spans every stage of the "721 model"

Author: Financing China 2023-09-29 12:11:00

In this wave of AI, iFLYTEK has been at the forefront.

As one of the earliest companies in China to invest in AI, iFLYTEK timed its moves well and quickly gained prominence in the "100-model war". Most notably, iFLYTEK has announced that the iFLYTEK Spark Cognitive Big Model is now open to the public.

A well-known AI company in China, iFLYTEK has long been a leader in the domestic AI field thanks to its deep technical foundation and continuous innovation. Opening the iFLYTEK Spark Cognitive Big Model to the general public marks another big step forward for iFLYTEK's commitment to openness and inclusiveness in AI.

iFLYTEK has never spared effort when investing in innovation. Over the past ten years, its annual R&D investment has accounted for about 20% of its revenue.

In technology research and development, iFLYTEK follows the "721 model": 70% is invested in current leading products, 20% in strategic new products, and 10% in forward-looking, exploratory research and development that is not required to deliver a return. According to Liu Cong, president of iFLYTEK Research Institute, the large model is a system that is present at every one of these stages, so it spans the whole of the "721 model".

As this magazine understands it, Liu Cong was born in 1984, joined iFLYTEK after graduating from the University of Science and Technology of China, and is now the president of iFLYTEK Research Institute, where he manages a team of 1,000 people. At the 2023 World Artificial Intelligence Conference, iFLYTEK was selected as one of the six co-leading organizations of the National Artificial Intelligence Big Model Standardization Task Force, with Liu Cong serving as leader on behalf of iFLYTEK.

At iFLYTEK, the main force behind AI research and development is the iFLYTEK Research Institute, led by Liu Cong.

What follows is this magazine's interview with Liu Cong:

This journal: From your point of view, how would you score this iFLYTEK Spark 2.0 upgrade on usability and ease of use?

Liu Cong: Let me focus on the code capabilities in this upgrade, and first explain why we led the August 15 press conference with iFLYTEK Spark's code capabilities and related products. Mr. Liu (Liu Qingfeng, chairman of iFLYTEK) first presented the general capabilities of iFLYTEK Spark at the press conference on May 6, and code is one of the important abilities among those general capabilities.

On HumanEval, the public code-capability test set built by OpenAI, Spark V1.5 scored only 41 points on Python, while V2.0 has reached 61 points, close to ChatGPT. On a test set of real-world coding scenarios constructed by the National Key Laboratory of Cognitive Intelligence, iFLYTEK Spark Cognitive Big Model V2.0 has surpassed ChatGPT in code generation and completion. These are objective metrics that we can show everyone.

That is why, building on the improved code capabilities, we launched the intelligent programming assistant iFlyCode 1.0. We designed it from the user's point of view, around the functions programmers need in real scenarios, and we will keep optimizing and iterating on the product.

This journal: iFLYTEK Spark 2.0 only opened its closed beta on August 13. Along the way, did you run into problems you had not anticipated that caused difficulties, and how did iFLYTEK solve them?

Liu Cong: First, improving code capabilities is actually harder than people think. Improving the model's code capabilities may affect its other capabilities, because iFLYTEK Spark uses one unified base model to deliver every function. That is the goal we set by benchmarking against general artificial intelligence.

Second, this upgrade focuses on multimodal capabilities. Multimodality may not yet be well understood by everyone. GPT-4 announced multimodal understanding on March 14 this year, but that capability has still not been fully released.

The multimodal capabilities we demonstrated on August 15 are not bad, but if you try some very complex graphs, iFLYTEK Spark may not understand them so well. Truly aligning speech, image, video and other modalities in a unified semantic space, so that semantics carry across and connect between them, is a very complex problem.

Beyond that, there is what I have often called systematic innovation. For a large model that covers multiple modalities and types, what is the ultimate goal? How do you implement the various functions, and how do you connect and integrate the various modules? For multimodal capability with a more complete set of functions, you can imagine that it is even more complicated.

Of course, the overall result is still in line with expectations, though there were some partial adjustments along the way. In the end, everyone withstood the pressure and we were able to deliver the phased results of such a complete product. Personally I am proud of that, though of course many parts still need continuous optimization.

This journal: What do you think about the convergence of base large model capabilities?

Liu Cong: I would like to answer from two angles. First, look not at what people say but at what they do; don't just listen to claims, try the product yourself.

I think the phrase Chairman Liu Qingfeng emphasized at the press conference, that words must have substance, is very accurate, and it is a key point. Take text generation: even though different people have different requirements for it, we still have to look objectively at the gap, on the specific task, between a given model and the best-performing model today.

Second, I think it depends on the level at which we look at this question. iFLYTEK has aimed at general artificial intelligence from the beginning. Which capabilities need to break through, to what extent, and at what level they become practically usable: what we care about most is whether deployment can create value.

This journal: iFLYTEK Research Institute follows a "721" technology R&D strategy: 70% goes to the current business that supports the company, 20% to strategic new products, and 10% to forward-looking technology that is not expected to deliver returns. Where does the large model sit today: 7, 2, or 1? And how has the large model's place in your research strategy changed over the past few years?

Liu Cong: As I understand it, the large model may span all of "721"; it is a base and a framework for us going forward. First, once the entire training process of the large model is running end to end, we need to cover more general capabilities and improve more scenarios, which is probably the 7 part.

Of course, we generally choose industries where we have a comparative advantage to find scenarios for extended applications, and we keep deepening and improving multimodal capabilities. I would probably classify that as the 2 part. And of course there has to be a 1 part as well.

So my understanding is that the large model is a system, and it is present at every one of these stages.

This journal: How will intelligent programming change the role of developers? What will the core competencies be in the future?

Liu Cong: I think there are at least two dimensions.

The first is to keep improving developers' ability to focus on programming itself.

The second is to improve efficiency in developers' daily work, which frees up their energy for creative work that raises productivity and unleashes imagination.
