
Can deep learning work without big data? A new approach lets small businesses train large models

Zhi DongXi (public account: zhidxcom)

Compiled by | Zhao Di

Edited by | Yunpeng

According to foreign media reports, AI expert Andrew Ng told IEEE that the future of deep learning should shift from training models on big data to training them on high-quality data, opening the door for industries that cannot obtain large datasets to apply deep learning models. Ng formerly directed the Stanford Artificial Intelligence Laboratory and led Google's Google Brain project.

Ng argues that improving a deep learning model should shift from adjusting the code to adjusting the data: by fixing the noisy data (meaningless or mislabeled samples) that skews training results, a model can be updated with only a small, high-quality dataset. Compared with rewriting the code or simply feeding in massive amounts of data, this approach is more targeted.

Landing AI, which Ng founded in 2017, currently offers computer vision tools for manufacturing product inspection that can quickly flag noisy data, allowing customers to update models on their own by changing data labels rather than adjusting the model itself.

First, deep learning shows strong potential, and big-data training has become the mainstream

The goal of artificial intelligence is to make machines "think" and "act" like humans. Machine learning is an important way to achieve this vision, and deep learning is an important branch of machine learning. After Professor Geoffrey Hinton's team used a deep neural network to win the 2012 ImageNet image recognition competition, deep learning gradually attracted widespread attention, replacing traditional machine learning methods in many fields and becoming a popular research area in artificial intelligence.

Deep learning has advanced rapidly over the past decade, and models keep growing larger. Take OpenAI's GPT series of natural language processing models as an example: GPT-1, released in 2018, had over 100 million parameters, while GPT-3, released in 2020, reached 175 billion. The continuing emergence of ever-larger models shows the development potential of deep learning.


However, Ng notes that while deep learning is widely used by consumer-facing companies, those companies typically have large user bases and can collect large datasets for model training. For the many industries that lack such datasets, the focus needs to shift from providing large amounts of data to providing high-quality data.

Second, shift the focus from code to data and train high-quality models with a small amount of data

Over the past decade, the dominant approach to training deep learning models has been to download a dataset and then focus on improving the code. But if a model performs well on most of the data and goes wrong only on a particular subset, changing the entire model architecture to accommodate that subset is inefficient.

Another approach is to start with the data, known as "data-centric AI." The common version of this is to improve a model's accuracy by supplementing it with more data. Ng's response is that collecting more data for every situation would be an enormous amount of work, so he has instead focused on building tools that flag noisy data (meaningless or mislabeled samples), offering a targeted way to supply small but high-quality datasets for model training.
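The article does not describe Landing AI's internal tooling, but a minimal sketch of this idea, assuming scikit-learn and integer-or-string class labels, is to let a simple cross-validated model vote on each sample and surface only the examples where a confident prediction disagrees with the stored label, so a human reviews a handful of rows instead of collecting more data:

```python
# Sketch only: flag likely mislabeled samples for human review.
# The feature/label arrays and the 0.9 confidence threshold are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def flag_suspect_labels(features: np.ndarray, labels: np.ndarray, threshold: float = 0.9):
    """Return indices of samples whose stored label conflicts with a
    confident out-of-fold prediction."""
    proba = cross_val_predict(
        LogisticRegression(max_iter=1000), features, labels,
        cv=5, method="predict_proba",
    )
    classes = np.unique(labels)                  # column order of predict_proba
    predicted = classes[proba.argmax(axis=1)]    # most likely class per sample
    confidence = proba.max(axis=1)
    # Suspect: the model is confident AND disagrees with the stored label.
    return np.where((predicted != labels) & (confidence >= threshold))[0]

# Usage: review and relabel only the flagged rows, then retrain.
# suspects = flag_suspect_labels(X, y)
```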

Ng said the methods he generally adopts are data augmentation and improving the consistency of data labels. For example, if a dataset contains 10,000 images and 30 similar images among them carry conflicting labels, he wants to build tools that identify the inconsistently labeled images so researchers can quickly relabel them, rather than collecting massive amounts of additional data for training.
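As one way to picture this label-consistency check (not Ng's actual tool), a sketch assuming the Pillow and `imagehash` packages could group visually similar images by perceptual hash and report pairs whose labels disagree; the file paths and labels below are illustrative:

```python
# Sketch only: find near-duplicate images that carry different labels.
from itertools import combinations
from PIL import Image
import imagehash

def inconsistent_label_pairs(labeled_paths, max_distance=5):
    """labeled_paths: list of (image_path, label) tuples.
    Returns pairs of visually similar images with conflicting labels."""
    hashes = [(path, label, imagehash.phash(Image.open(path)))
              for path, label in labeled_paths]
    conflicts = []
    for (p1, l1, h1), (p2, l2, h2) in combinations(hashes, 2):
        # A small Hamming distance between perceptual hashes means the
        # images look alike; differing labels then warrant a human look.
        if h1 - h2 <= max_distance and l1 != l2:
            conflicts.append((p1, p2, l1, l2))
    return conflicts

# Example: conflicts = inconsistent_label_pairs([("a.jpg", "scratch"), ("b.jpg", "ok")])
```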

Third, Landing AI provides data labeling tools, and users can update models independently

Ng founded Landing AI in 2017 to provide manufacturers with computer vision tools for inspecting their products. On the company's homepage, Ng notes that finding scratches on boards by eye pushes past the limits of human vision, while AI-based inspection is far more accurate.

Landing AI focuses on enabling customers to train machine learning models themselves. The company mainly provides tools for labeling data when anomalies appear, so that companies can quickly update their models on their own.


Ng said this is not just a manufacturing problem. In healthcare, for example, every hospital's electronic health records have their own format, and it is unrealistic to expect programmers at each hospital to develop separate models. The only way forward is to give customers tools so they can build adaptable models themselves. Landing AI is currently promoting such tools in computer vision, and other AI fields need to do the same.

Conclusion: deep learning methods may be shifting, and data quality matters more than quantity

For a long time, updating and optimizing deep learning models has relied mainly on adjusting the model itself, or on directly adding more data and retraining repeatedly to improve accuracy. Ng instead recommends flagging and fixing a small amount of noisy data to achieve more targeted model optimization.

Earlier, Ng launched a "Data-Centric AI" competition on Twitter, drawing more practitioners' attention to optimizing models through data, and a growing number of researchers are using data augmentation, synthetic data, and similar methods to train models more efficiently. Whether data optimization will become the mainstream path to model iteration is worth watching.
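For readers unfamiliar with the data augmentation mentioned above, a minimal sketch assuming PyTorch/torchvision shows the idea: generate label-preserving variants of a small image set (flips, small rotations, slight color jitter) so the model sees more variety without new data collection; the parameter values here are illustrative:

```python
# Sketch only: label-preserving image augmentation for a small dataset.
from torchvision import transforms
from PIL import Image

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),             # mirror half the time
    transforms.RandomRotation(degrees=10),               # small rotations keep labels valid
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),                                # convert to a CHW float tensor
])

# Each call yields a new randomized variant of the same labeled image.
# image = Image.open("defect_example.jpg")
# samples = [augment(image) for _ in range(8)]
```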

Source: IEEE
