Is China a big AI country or a true AI powerhouse?

“In the field of deep learning frameworks, PaddlePaddle has made remarkable achievements, breaking the monopoly of Google and Meta.”

Author | Dai Congfei

Editor | Hu Zhe

Recently, Stanford University released the "2022 AI Index Report". The report, which runs to more than 190 pages, covers AI developments in R&D, the economy, education and more.

According to the report, China accounted for 52% of global AI patent applications in 2021, ranking first in the world. In the number of granted patents, however, it still lags behind the United States.

This result is not surprising. As the birthplace of computer science, the United States has long held a singular position in AI, and it is difficult for other countries to match it. But it is also an indisputable fact that China is catching up.

Given that it still trails overall, how can Chinese AI catch up? Through a breakthrough at a single point, or through progress on many fronts? Through commercial deployment, or through competition in engineering and technology? These questions have yet to be answered.

However, deep learning frameworks and platforms, which rest on foundational innovation and connect directly to industrial deployment, seem to offer a good vantage point on how Chinese AI can catch up.

A big AI country, but not yet an AI powerhouse

In 1956, at a quiet college in the small town of Hanover, a group of well-known scientists, including Herbert Simon, who would later win the Nobel Prize in economics, gathered for a lengthy meeting to discuss a topic that seemed impossible to people at the time: using machines to imitate human learning and other aspects of intelligence.

This became known as the Dartmouth Conference. It lasted two months, and although the scholars did not reach a consensus, they gave the subject of their discussion a name: artificial intelligence (AI).

In the 66 years since, many of the topics discussed have become reality. AI has moved from logical reasoning and expert systems to a third wave driven by machine learning and deep learning, ushering in a period of explosive growth.

Objectively speaking, the United States, which was the first to put forward the theory of AI, still dominates global AI development in both basic technological innovation and commercialization, and remains unmatched in academic research and practical application.

However, China also has unique advantages.

Many people know that the three elements of AI research and development are algorithms, computing power and data, but these are only the elements in the technical sense. Beyond them, China has a huge domestic market of 1.4 billion people, many world-class, super-large-scale Internet platforms, and heavy demand for new infrastructure from the transformation and upgrading of traditional industries. Together these mean that, apart from the United States, no economy, the technologically developed European Union included, has the foundation to compete with China as a world-class source of AI innovation.

Indeed, we have many favorable AI policies, as well as strong data and results.

For example, as early as 2016, the term "artificial intelligence" was written into China's 13th Five-Year Plan. Since then, favorable policies have been issued frequently, the development of AI companies has entered the fast lane, and both the number and the value of financing deals have grown rapidly.

For example, since 2016, China's AI sector has averaged about 1,000 financing events per year. In 2021, there were 1,132 investment and financing events in the industry in China, with a cumulative value of 399.64 billion yuan, up 51.44% from 2020 and a record high.

For example, in computing power, the most fundamental layer of AI, China's pace of development cannot be ignored. Some reports show that over the past year the computing power of many countries improved, but China's increased the most, placing it among the global leaders with a total score of 70 points.

In addition, China probably publishes more AI conference papers than any other country and, on paper, ranks first in the world in AI patents.

But that is no reason for blind optimism.

After all, AI must be applied before it can deliver practical value.

According to the "2022 AI Index Report" released by Stanford University, the United States accounts for 40% of the global total number of granted patents, ranking first in the world.

The implication of this data is that obtaining a patent grant "proves that your patent is actually credible and useful." This is somewhat similar to the relationship between the number of publications and the number of citations. In other words, China produces many research results, but in influence and real-world deployment they still lag behind those of the United States.

In part, of course, this is because the United States has a first-mover advantage in AI and the entire field of computer science.

But to some extent, even in the United States, AI is the achievement of a very small number of forward-looking scholars who persevered and blazed a trail through hardship at a time when the public generally could not see its potential. That spirit is worth learning from.

Looking back at history, deep learning has traveled a path from marginal topic to mainstream technology. The exploration of neural networks by the "godfather" figure Geoffrey Hinton dates back to the early 1980s, when AI was not only marginal but also in a trough. It was the efforts of a few people such as Hinton that brought neural networks into a boom of research and application and turned "deep learning" from a marginal topic into a core technology relied on by Internet giants such as Google, making artificial intelligence as hot as it is today.

It is worth mentioning that Hinton realized as early as 2013 that enterprises might offer better AI research scenarios, data and computing power than universities, so he joined Google that year, which helped turn a series of AI technologies into products.

For China, which aspires to become a global hub for artificial intelligence, the catch-up began at around the same time.

Why a deep learning framework?

On the surface, 2015-2016 was the watershed at which AI technology entered public awareness.

At the end of 2015, Google released TensorFlow, which has been a mainstream deep learning framework ever since, and in March 2016 Google DeepMind's AlphaGo defeated world Go champion Lee Sedol 4:1 in a human-machine match. The concept of deep learning thus became known to more and more "laymen", and the technology has been iterating rapidly ever since.

Geoffrey Hinton once said in a speech: "Deep learning was not successful before because it lacked three necessary prerequisites: enough data, enough computing power, and properly initialized weights." Now, these obstacles are gradually being removed.

There is a famous saying in the industry that search engines are the largest AI projects in existence. This has been confirmed in both China and the United States.

Back in 2011, Google had incubated a project called DistBelief through Google Brain. Subsequently, a large number of scientists and engineers, including Geoffrey Hinton, reworked it into what later became famous as TensorFlow.

Coincidentally, in China, the first company to develop artificial intelligence, and deep learning frameworks in particular, was also a search engine company.

According to existing records, Baidu's own use of artificial intelligence technology can be traced back to 2006. After the strong rise of the current wave of deep learning, Baidu was also the first in China to "see" the potential of deep learning technology and applications, and to some extent it is one of the early pioneers of deep learning. For example, in 2013, Baidu took the lead in establishing the world's first research institute dedicated to deep learning.

The question here is why Baidu gradually chose a deep learning framework and platform as its core breakthrough in AI research and development.

In fact, Baidu's earliest applications of artificial intelligence were not entirely top-down. On the contrary, there was a certain bottom-up trend: like a spreading fire, artificial intelligence gradually appeared in applications at different levels across Baidu's systems, architectures and products.

It can be said that the deep learning framework is the starting point from which the vast majority of people use artificial intelligence. Without one, models have to be built by hand, which is the preserve of a few scientists and senior engineers and far too difficult to popularize.

At that time, teams inside Baidu were not only using early deep learning frameworks from different sources; some departments had even begun developing deep learning frameworks of their own.

"R&D goes with the business" is also a norm. But the spark of deep learning has aroused the attention of Baidu's senior management.

On a super-large Internet platform, it is common for different businesses and departments to use different technical foundations, and the walls between departments are hard to break through. But this time Baidu decided to unify its deep learning technology on a single framework so that it could concentrate resources on a breakthrough.

To this end, Baidu sorted out the needs of its various departments. To some extent, these needs represented the most demanding AI application requirements in Chinese industry at the time, so designing a framework that could accommodate them all would also help many enterprises and industries lower the threshold for applying AI.

Compared with many frameworks that originated in universities and went through long, winding histories, PaddlePaddle was built on the foundations of an "industry-grade" deep learning framework from the beginning.

Building on its accumulated technology, Baidu officially open-sourced the PaddlePaddle framework in 2016. Three years later, in April 2019, PaddlePaddle officially announced its Chinese name, 飞桨 ("Flying Paddle").

In the United States, in 2018, TensorFlow held the largest share of activity on GitHub, search volume on Google, articles on the well-known technology platform Medium, and papers on arXiv.

Also in 2018, the Caffe2 code was merged into PyTorch, combining the two deep learning frameworks backed by Facebook into one. PyTorch's development then entered the fast lane, and today it holds an absolute advantage in academic papers. According to statistics, 85% of the models on Hugging Face are PyTorch-exclusive.

Baidu resolutely made an important decision after observing the respective strengths and weaknesses of these two world-class frameworks.

Why PPT?

PaddlePaddle was able to become the third pole among the world's deep learning frameworks, alongside PyTorch and TensorFlow, and its true overtaking on the curve came from one major decision.

Of the two major US frameworks, one is popular in academia and the other in industry. To win through differentiation and bring the best of academia and industry into a single ecosystem, PaddlePaddle had to take a different path: evolving from a purely industry-grade framework into a general-purpose framework that breaks down the barriers between industry and academia, one that is both industrial-grade and warmly embraced by the academic community.

To stay grounded in real needs, PaddlePaddle's R&D staff often join QQ groups to collect developers' feedback and resolve problems promptly. This humble attitude of putting developers' needs first has not only won PaddlePaddle the support of many developers but also accelerated its development. By the end of 2021, PaddlePaddle had brought together 4.06 million developers, created 476,000 models, and served 157,000 enterprises.

In terms of market share, a report released by IDC shows that in China's deep learning platform market, Baidu's overall share continued to grow in the first half of 2021, surpassing Google and Meta (Facebook) and moving into first place.

The "2021 China Open Source Annual Report" also noted that among the 30 most active Chinese projects on GitHub in 2021, PaddlePaddle accounted for 5, with the PaddlePaddle framework itself ranking first.

This means China's AI developers and users no longer have to rely on foreign platforms. It also lays a solid foundation for cultivating an independent and controllable ecosystem for AI development and application, and it stands as a prominent, world-class achievement for China in the foundations of computer science.

At this point, PaddlePaddle, PyTorch, and TensorFlow have formed a three-way contest, and deep learning frameworks have entered the "PPT" era.

But that's not the end of the story.

With the maturity of the theoretical research of deep learning and the rapid iteration of the deep learning framework, the application and popularization of AI technology have entered an accelerated period, but there are still many pain points in the process of specific practical application.

For example, producing artificial intelligence models is still very costly, and enterprises run into all kinds of thorny problems, such as adaptation, when putting them to use. In response, PaddlePaddle provides developers with full-process guidance in its model library. From adaptation at the outset to operation afterwards, PaddlePaddle offers a corresponding solution.

Specifically, in the early stages of data processing and model selection, PaddlePaddle helps enterprises choose a solution suited to their own scenario. After that, it also tracks efficiency on the deployed chips and gives quick feedback and guidance when accuracy turns out to be low.

In other words, this set of guidelines is not just an academic algorithm but an end-to-end pipeline that is truly tailored to the needs of industry.

In addition, some industries have special requirements, such as the very high speed demanded by part quality inspection, and with limited computing power it is difficult to maximize speed and accuracy at the same time. To address these pain points, PaddlePaddle designed the PP series of models, optimizing the algorithms to strike a balance between accuracy and performance.

As of 2021, PaddlePaddle had released 13 PP series models, and its official model library contained more than 500 algorithms. In training these models, the PaddlePaddle team distilled its own methodology, which speeds up training to a certain extent.

Ma Yanjun, general manager of Baidu's AI technology ecosystem, told Leifeng that improving training performance is a long-term research and development problem for any deep learning framework. To this end, PaddlePaddle has done a great deal of work on training performance.

According to Ma Yanjun, work on training performance happens at two main levels. One is joint optimization with the training chip, to make full use of the hardware's computing power. The other is optimization of scheduling within the deep learning framework itself. Model design is also taken into account so that the whole pipeline is optimized, with the ultimate goal of faster training.

It can be said that, from beginning to end, every step of PaddlePaddle's design and optimization has helped secure the speed of the training that follows.

After this hard work, about 70% of PaddlePaddle's models train faster than the fastest level in the industry. These models fall into two main categories. The first is general-purpose models, which are used across the board, for example in chip adaptation. The second is models for high-demand application scenarios that the PaddlePaddle team has identified. For example, image classification is very common in logistics, e-commerce and other scenarios.

PaddlePaddle also pushes hardware computing power close to its limit. It is supported by Intel, Nvidia, ARM and many other chip makers, and has completed adaptation and joint optimization for 31 kinds of chips with 22 domestic and foreign hardware manufacturers. It is worth mentioning that many hardware manufacturers also take the initiative to contribute code to PaddlePaddle in the open source community.

On deeply adapted chips such as Nvidia's, PaddlePaddle can make full use of the available computing power.

In the field of deep learning frameworks, PaddlePaddle has made remarkable achievements, breaking the monopoly of Google and Meta and becoming the undisputed number one among deep learning platforms in China.

Of course, it is undeniable that adapting deep learning frameworks is still fairly complicated, and that low-frequency, long-tail problems in some industries cannot yet be solved with AI. But these are exactly the problems PaddlePaddle has been trying to solve, and it has indeed achieved some success.

Ma Yanjun said frankly, "Although the deep learning framework is a competition of high investment, long cycles and ecosystem building, it has received strategic support from the state and enterprises, and it is the key to opening the next AI era."

What legend PaddlePaddle and Chinese AI will write in the next decade of artificial intelligence, we may as well wait and see.

END
