iFLYTEK Spark gets an upgrade: the "Super Knowledge Assistant" launches and steps out of the "long text" melee

Written by Chen Dengxin

Editor / Li Ji

Layout / Annalee

On April 26, 2024, iFLYTEK's Xinghuo (Spark) large model V3.5 received its spring update: a one-sentence voice replication function gives the technology a warmer touch, the Xinghuo intelligent twin platform went live to help enterprises solve the "last mile" problem of large model applications, and iFLYTEK Xinghuo V4.0 is scheduled for official release on June 27.

The most eye-catching item is that iFLYTEK Xinghuo has become the industry's first large model to support "long text, long image-text, and long voice", squarely addressing users' pain point of acquiring knowledge efficiently and accurately.

By comparison, the contest over whose long-text length is "first in the world" has grown dull.

So why does iFLYTEK want to build a large model for long text, long image-text, and long voice? How good is that model? And what gives iFLYTEK the confidence that it can reach the finals of the large model race?

"Long text" competition, entering the 2.0 era

The arrival of ChatGPT set off the "war of a hundred models".

With so many contenders, the industry has also been asking what large models are actually worth. Moving from early-adopter novelty to practical use has become a shared demand, and applications have become the "main battlefield" of the large model contest.

However, the "big manufacturers" have focused mainly on the business (B-end) side, empowering industries in pursuit of symbiosis, shared prosperity, and win-win outcomes.

The consumer (C-end) side has received less attention, and its demand for efficiency gains went unmet for a long time. The complaint that "revising AI-generated copy takes no less time than writing it from scratch" struck a chord with many users.

Only with the arrival of "long text" capabilities did things begin to shift.

After all, reading a long text by hand takes hours, while a large model takes seconds. The efficiency gain is plain to see, and efficient knowledge acquisition for consumers has moved from dream to reality.

According to public information, GPT-4 Turbo (128k) handles about 100,000 Chinese characters and Claude 3 (200k) about 160,000. Domestic large models, led by Kimi, keep escalating the competition: long-text capacity has climbed from 200,000 Chinese characters to more than 10 million, in what amounts to an "arms race".

As a company that understands both the B-end and the C-end, iFLYTEK sees things differently.

iFLYTEK's analysis found that, when acquiring knowledge and learning, most users have more to work with than ready-made long texts. There are also newspapers and books, slides from seminars, teachers' blackboard writing, students' notes, recordings of meetings and interviews, online press conferences, and training and education videos. By uploading this text, imagery, and audio to iFLYTEK Xinghuo, users can quickly obtain knowledge across all of these dimensions.

In plain terms, iFLYTEK has stepped outside the fixed mindset of the long-text battle and used multimodality to compete on a different level, addressing the many real scenarios in which users need to acquire knowledge efficiently and escaping the current "involution" over long text.

On this point, Liu Qingfeng, chairman of iFLYTEK, said: "We can see from usage of the Xinghuo app that the peak is not on weekends but on weekdays, at 9:30 a.m. and 3:30 p.m., which means the vast majority of users rely on iFLYTEK Xinghuo to solve work-related problems."

Qimai data shows that downloads of the iFLYTEK Xinghuo app on Android have exceeded 96 million, ranking first among general-purpose large model apps in the domestic tools category.

From usable to loved: finding essential needs in real scenarios

As the above shows, continuing to use technological progress to solve real, essential needs is the key to iFLYTEK Xinghuo's recognition by users. It is also in line with the purpose that iFLYTEK's large model has always held to: "liberating productivity and unleashing imagination".

In fact, iFLYTEK's long text, long image-text, and long voice model can fairly be called an efficiency "power tool" for working professionals.

On the one hand, long text processing is more professional.

Although more and more large models support long-text processing, their real value varies, and the reason is closely tied to the use of RAG (retrieval-augmented generation) algorithms.

An industry insider told Zinc Scale: "Put simply and crudely, a RAG algorithm splits a long text into multiple short texts and then processes them. This lowers the technical threshold and is a big advantage when competing on length, but the ability to capture context is relatively weak, which reduces processing efficiency and puts it at a disadvantage in accuracy, coherence, and reliability."

The same insider added that RAG algorithms meet the "usable" standard: they suit work scenarios that do not demand high knowledge accuracy and require users to check the results manually. Lossless algorithms, by contrast, capture the full context, understand long texts more accurately, and meet the "easy to use" standard.
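The splitting step the insider describes can be illustrated with a minimal sketch. It is not the implementation of iFLYTEK, Kimi, or any other vendor: the function names (`split_into_chunks`, `overlap_score`, `retrieve`) are hypothetical, and simple word overlap stands in for the vector-embedding similarity that production RAG systems typically use. The point is only that the model is shown the best-matching chunks rather than the whole document, so context that falls in other chunks can be missed.

```python
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def overlap_score(query: str, chunk: str) -> int:
    """Count how many query word occurrences also appear in the chunk.

    Assumption for illustration: word overlap replaces embedding similarity.
    """
    query_counts = tokenize(query)
    chunk_counts = tokenize(chunk)
    return sum(min(count, chunk_counts[word]) for word, count in query_counts.items())


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda chunk: overlap_score(query, chunk), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    document = (
        "The buyer must pay the deposit within ten days. "
        "If the deposit is late, the seller may cancel the contract."
    )
    chunks = split_into_chunks(document, chunk_size=9)
    # Only the retrieved chunk would be passed to the model as context.
    print(retrieve("when must the buyer pay the deposit", chunks))
```

With a small `chunk_size`, the consequence of a late deposit lands in a different chunk from the payment deadline, so a question that needs both facts may get only half of the relevant context. That is the trade-off the quote attributes to RAG.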

iFLYTEK Xinghuo has gone one step further, reaching the standard of being both easy to use and loved by users.

iFLYTEK Xinghuo's general long-text capabilities, including long-document information extraction, knowledge Q&A, summarization, and text generation, are close to GPT-4 Turbo overall. On knowledge Q&A tasks in various vertical fields, the overall long-text level of the Xinghuo large model has surpassed GPT-4 Turbo.

More importantly, using sparse pruning and knowledge distillation, iFLYTEK launched what it calls the industry's best-performing 13-billion-parameter large model, with a performance loss of under 3%. This gives Xinghuo a large efficiency gain in document upload and parsing, time to first response in knowledge Q&A, and text generation.

Tests show that, with long-text quality maintained, the Xinghuo large model delivers the best performance in the industry whether the input is 10K, 64K, or 128K tokens, or longer still.

Even scribbled handwriting can be recognized, which was a pain point for Kimi.

On the other hand, its innovation targets essential needs.

For a latecomer, being more professional is not enough; catching up also takes something unique. iFLYTEK looks for essential needs in real scenarios and innovates by meeting them.

As a result, long image-text and long voice give iFLYTEK Xinghuo the competitive advantage of "having what others lack".

What's more, long text, long image-text, and long voice reinforce one another. The range of application scenarios has expanded greatly, and iFLYTEK Xinghuo has gained a larger space for incremental adoption.

For example, in daily life we often encounter lengthy purchase contracts, insurance contracts, and the like. Not being able to understand them, read them through, or take in every clause has long been a pain point.

As another example, combining long text with long voice improves transcription efficiency and the ability to organize content into chapters, making lesson preparation for teachers and review for students more convenient and less stressful.

In addition, the iFLYTEK AI learning machine is the world's first AI learning machine built on a cognitive large model. Adding long image-text and long voice improves its intelligent tutoring ability, making features such as spoken English practice, Chinese and English composition correction, interactive math tutoring, open-ended encyclopedia Q&A, and the parent-child education assistant more interactive. This increases children's interest in learning and further releases their creativity, inspiration, and imagination.

In 2023, thanks to iFLYTEK Xinghuo, the GMV of consumer hardware products such as the iFLYTEK AI learning machine, iFLYTEK smart office notebook, iFLYTEK intelligent voice recorder, and iFLYTEK intelligent translator grew by 84%.

Evidently, the "chemical reaction" among long text, long image-text, and long voice meets users' essential need to obtain knowledge more efficiently in every scenario.

Imagination becomes productivity, and the computing power base is the key

It is not hard to see that iFLYTEK has pointed out a direction for the large model contest: avoid ineffective "involution" and return to the "main channel" of technological innovation, and overtaking on the bend becomes possible.

After all, technological innovation is the greatest productivity.

Turning imagination into productivity depends on iFLYTEK's long-term hard work on fundamentals and on a solid base for its large model, which lets it run faster and farther.

In short, computing power is the foundation of the large model and an important guarantee for supporting long text, long image-text, and long voice.

As it happens, iFLYTEK has always insisted on doing things that are difficult but right. Compared with the industry's "big players", its funds are not outstanding, yet it has persistently built up its computing power and become one of the few AI companies with its own large model base.

According to its financial report, iFLYTEK's R&D expenses in 2023 were 3.839 billion yuan, a year-on-year increase of 11.89%, while annual net profit was only 657 million yuan. R&D spending was thus 5.84 times net profit.

It is worth mentioning that the computing power base of iFLYTEK is independent and controllable.

In October 2023, iFLYTEK and Huawei jointly released "Feixing No. 1", the first domestic 10,000-card computing platform that supports training large models with trillions of parameters. Through improved bandwidth utilization and optimized parallel training algorithms, iFLYTEK Xinghuo has reached 90% of the computing power of the NVIDIA A100 on Huawei's 910B chip, and even surpassed NVIDIA in some specialized capabilities.

As a result, the iFLYTEK Xinghuo large model V3.5 has become the first large model with fully independent intellectual property rights trained on domestic computing power, and it need not fear the risk of a supply "chokehold".

With these strengths combined, iFLYTEK ranks in the first echelon of large models.

The recently concluded 27th United Nations Science and Technology Conference bears this out: iFLYTEK, together with dozens of well-known enterprises at home and abroad such as OpenAI, Google, and Microsoft, took part in compiling two international standards, "Security Testing Standards for Generative Artificial Intelligence Applications" and "Security Testing Methods for Large Language Models".

All in all, iFLYTEK has built on a large model computing base that is on par with the most advanced capabilities in the world to incubate a large model for long text, long image-text, and long voice. With more professional text processing, richer application scenarios, and a better fit with user needs, it has established a leading position among large models.

And so iFLYTEK's "spark" is starting a prairie fire.
