
AI, How Fast?


Artificial intelligence (AI) has long since worked its way into every corner of our lives.

It is not only the strongest opponent in competitions such as Go and video games; it also helps scientists in many fields solve problems. Beyond that, the voice-to-text and one-click translation features in our messaging apps, and the smart recommendations on shopping sites, may all have AI "hiding" behind them.

No matter the kind of AI, it cannot do without the support of powerful computing systems. Just as past industrial revolutions rested on major breakthroughs in precision measurement, raw materials, and manufacturing, AI likewise needs entirely new technologies to drive it.

In this "AI era", what is it like when computing systems from dozens of leading AI institutions around the world compete on the same stage?

Such an "Olympics" really exists. MLCommons, a well-known open consortium for machine learning, organizes a benchmark called MLPerf that provides uniform measurements of the speed and efficiency of AI computing systems, allowing researchers to advance the best ideas and solutions by comparing various technological innovations.

In April, MLPerf announced the results of its first round of 2022 inference performance tests (MLPerf Inference v2.0), giving us a fresh picture of how today's top computing systems perform.

Training and inference of AI

MLPerf testing is conducted four times a year and includes both training performance tests and inference performance tests. To better understand "training" and "inference" in this context, let us first briefly look at how AI works, and why it is so different and so attractive.

AI scholar Janelle Shane, in her book "You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It's Making the World a Weirder Place", gives a vivid example of training an AI to tell jokes.

If we used traditional programming to make a computer tell jokes, we would have to spell out all the "rules" of a joke in a programming language. No matter how complex the program ends up, we are essentially still setting concrete rules for the computer to solve the problem.

But training an AI is very different, and many AI experts agree that writing AI programs is more like teaching students than traditional programming.

In Shane's words, put simply (the reality is of course not so simple), we just throw some existing jokes at the AI, tell it with a few basic instructions that the goal is to write jokes, add a whole bunch of random characters, "then I'll go get my coffee", and the AI gets to work.

It might start by guessing, studying the dataset again and again and adjusting itself, figuring out more rules on its own. Of course, some rules can also accidentally lead it astray: a Stanford University research team once tried to train an AI to distinguish pictures of healthy skin from skin cancer, only to accidentally train a ruler detector, because many of the tumor photos in the dataset included a ruler for scale.

But more often, under proper training, an AI can discover a large number of rules that programmers, or perhaps no one at all, knew about, and build its own "knowledge system".

An AI that has completed training is like a student who has mastered the material: it can apply what it has learned to specific scenarios and quickly give answers based on new data it has never seen before. This is the so-called ability to infer.

This is what makes AI such an attractive solution, with endless potential and creativity.
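The train-then-infer split described above can be sketched with a toy model: below, a tiny logistic-regression classifier learns a rule purely from labeled examples, then applies it to data it has never seen. This is an illustrative sketch only; real text-generating models work very differently.

```python
import math
import random

# Toy stand-in for "learning rules from examples". (Illustrative only.)
random.seed(0)

# Training data: points labeled by a rule the model does NOT know (x + y > 1).
points = [(random.random(), random.random()) for _ in range(200)]
examples = [((x, y), 1.0 if x + y > 1 else 0.0) for x, y in points]

w1, w2, b = 0.0, 0.0, 0.0   # the model's adjustable "rules"
lr = 0.5                    # learning rate

def predict(x, y):
    """Forward pass: squash a weighted sum into a 0..1 guess."""
    z = w1 * x + w2 * y + b
    return 1 / (1 + math.exp(-z))

# Training: study the dataset again and again, nudging the rules
# a little every time a guess is wrong.
for epoch in range(2000):
    for (x, y), label in examples:
        error = predict(x, y) - label
        w1 -= lr * error * x
        w2 -= lr * error * y
        b  -= lr * error

# Inference: apply the learned rules to data never seen in training.
print(predict(0.9, 0.9))   # far above the boundary, so close to 1
print(predict(0.1, 0.1))   # far below the boundary, so close to 0
```

Note that nobody told the model the rule "x + y > 1"; it recovered it by repeated guessing and adjustment, which is exactly the contrast with rule-writing that Shane draws.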

However, the entire process depends on massive amounts of data and computation, and everything needs to be completed in the shortest possible time. This places great demands on the performance of the computing system.

MLPerf testing therefore examines computing performance from both angles: training and inference.

What does MLPerf measure?

In MLPerf, the training performance test is relatively simple. It is divided into two scenarios, single-node and cluster, and measures how long a computing system takes to train mainstream AI models: the faster the training finishes, the stronger the system's performance.
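The training metric boils down to a wall-clock measurement: start a timer, train until the model reaches a fixed quality target, stop the timer. A schematic sketch follows; the `train_one_epoch` stub and the 0.759 target are placeholders, while the real benchmark precisely fixes the model, dataset, and target quality.

```python
import time

TARGET_QUALITY = 0.759   # placeholder quality target (e.g. validation accuracy)

def train_one_epoch(state):
    """Stand-in for one pass over the training data."""
    state["quality"] += 0.05   # pretend quality improves each epoch
    return state

state = {"quality": 0.0}
epochs = 0
start = time.perf_counter()
while state["quality"] < TARGET_QUALITY:   # train until the target is reached
    state = train_one_epoch(state)
    epochs += 1
elapsed = time.perf_counter() - start      # time-to-target is the score

print(f"reached target quality in {epochs} epochs, {elapsed:.4f} s")
```

Measuring time-to-target-quality, rather than time per iteration, rewards systems that train both fast and well, which is why "the faster the training finishes" is a meaningful comparison.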

The inference performance test released this round is more comprehensive and more complex. It is like an all-around event or a triathlon in sports, or more precisely an "iron man" contest with 33 events. It sets different metrics for different scenarios to measure how fast and how well a computing system completes a variety of AI tasks, and it has become one of the authoritative benchmarks in the industry.

Inference performance testing is first divided into two categories: fixed tasks and open optimization. Fixed tasks emphasize like-for-like comparison, putting different computing systems on the same starting line, so they are relatively more important.


Inference performance testing is divided into two types: fixed task and open optimization. (Figure/Principle)

To ensure comprehensiveness, the fixed-task category covers 6 major application scenarios, and each scenario selects the most mainstream AI model as its test task.


6 application scenarios for inference performance testing. (Figure/Principle)

These scenarios are very close to practical applications and closely tied to our lives. Take one of the simplest examples: in computer vision, image classification is one of the most basic problems. Whether we are searching for pictures online, letting a phone album automatically categorize photos, or intelligently analyzing videos, one of the computer's basic tasks is to distinguish different pictures based on the information in the images.

For human-computer interaction, the language model is fundamental. Natural language processing (NLP), which lets machines understand human language, can be applied to translation, question answering, text generation, and more, and every kind of intelligent assistant depends on it.

In addition, some more specialized directions are included, such as biomedical image segmentation. Medical images taken in hospitals, such as CT and MRI scans, are not like ordinary photos: they are often "blocky", meaning a whole picture is composed of many slices, which brings extra challenges to image processing. Biomedical image segmentation separates out the organs or lesions in these medical images so they can be identified and analyzed more accurately, a key step in computer-aided medicine.

For these application scenarios, the test sets different dimensions. You can think of this as further refining each application scenario into richer, more realistic situations, so as to comprehensively test the computing system's performance under every possible condition.


For different models, the test also sets different dimensions of examination, including different scenarios in the data center and at the edge. (Figure/Principle)
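These dimensions correspond to different speed questions: at the edge one typically cares how fast a single query comes back (latency), while a data-center offline setting cares how many samples per second the system can process in bulk (throughput). A schematic way to measure both around any inference function; the `infer` stub here is a dummy stand-in for a real model.

```python
import statistics
import time

def infer(sample):
    """Dummy stand-in for a model's forward pass (~1 ms of 'work')."""
    time.sleep(0.001)
    return sample * 2

samples = list(range(100))

# Single-stream style: issue one query at a time, record per-query latency.
latencies = []
for s in samples:
    t0 = time.perf_counter()
    infer(s)
    latencies.append(time.perf_counter() - t0)
p90_latency = statistics.quantiles(latencies, n=10)[-1]  # 90th percentile

# Offline style: push the whole batch through, record overall throughput.
t0 = time.perf_counter()
for s in samples:
    infer(s)
throughput = len(samples) / (time.perf_counter() - t0)

print(f"p90 latency: {p90_latency * 1000:.2f} ms")
print(f"throughput:  {throughput:.0f} samples/s")
```

The same system can rank very differently on the two numbers, which is why a benchmark that wants to cover both edge and data-center use has to report them separately.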

New records, new futures

A total of 19 institutions participated in this round of MLPerf inference performance testing, submitting more than 1,000 results.

Among them, Inspur's AI servers won 27 of the 33 tasks, including all 16 data center events and 11 of the 17 edge events, setting new AI inference speed records in each winning task.


Inspur AI servers set records in this MLPerf inference performance test (data center, offline scenario). (Figure/Principle)

This represents the most advanced level of AI computing available today. As AI applications deepen across industries, faster inference will bring higher efficiency and capability to AI applications and accelerate the intelligent transformation of industry.

Compared with the previous round of results, Inspur's AI servers improved inference performance on image classification, speech recognition, and natural language processing tasks by 31.5%, 28.5%, and 21.3% respectively. In other words, these systems can complete intelligent tasks more efficiently and quickly in scenarios such as autonomous driving, voice conferencing, intelligent question answering, and smart healthcare.

Driven by powerful computing power, digital technology will be applied ever more deeply in the physical world. In the future, we may all have the chance to ride in highly automated cars that, with the help of intelligent transportation systems, take us where we want to go in the fastest and safest way. After just a few words to an intelligent assistant, the goods we order can be delivered in the shortest possible time. With real-time speech recognition and translation, language barriers gradually melt away, and we gain more opportunities to communicate with and understand the wider world.

As Inspur Information puts it, in the era of intelligence, computing power is productivity, and intelligent computing power is innovation. It will become an important force in a new round of scientific and technological revolution and industrial transformation.

#Creative team:

Written by: Takeko

Typography/Design: Wenwen

#References:

https://mlcommons.org/en/

https://mlcommons.org/en/news/mlperf-inference-1q2022/

[US] Janelle Shane, "You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It's Making the World a Weirder Place", CITIC Publishing Nautilus, April 2021

Information on test results is provided by Inspur Information.

#Image sources:

Cover image: Principle

Header image: Mike MacKenzie, Flickr, CC BY

*This article is sponsored by Inspur Information.
