
Jeff Dean, Google heavyweight, as sole author: The Golden Decade of Deep Learning Research

Selected from Dædalus

Written by Jeff Dean

Compiled by Machine Heart

Editors: Du Wei, Chen Ping

Jeff Dean has personally written an article on what drove the rapid development of deep learning during the decade of the 2010s, and he also lays out his own vision for the future of AI.

Ever since the dawn of computing, humans have dreamed of building machines that can think. When John McCarthy coined the term artificial intelligence at a 1956 workshop held at Dartmouth College, a group of mathematicians and scientists came together to find out how to make machines use language, form abstractions and concepts, and solve the kinds of problems then reserved for humans. The workshop participants were optimistic that a few months of concentrated effort would yield real progress on these problems.


Participants in the 1956 Dartmouth conference on artificial intelligence included Marvin Minsky, Claude Shannon, Ray Solomonoff, and other scientists. Photo by Margaret Minsky.

The few-months timeline proved to be overly optimistic. Over the next 50 years, a variety of approaches to building AI systems rose to prominence and then fell out of favor, including logic-based systems, rule-based expert systems, and neural networks.

It was not until around 2011 that AI entered a critical phase of development, making huge advances thanks to the renaissance of neural networks in the form of deep learning. Advances in these techniques helped improve the ability of computers to see, hear, and understand the world around them, enabling AI to make great strides in science and other areas of human exploration. What are the reasons for this?

Recently, Google heavyweight Jeff Dean published an article titled "A Golden Decade of Deep Learning: Computing Systems & Applications," which examines what drove this progress during deep learning's golden decade. The article focuses on three areas: the computing hardware and software systems that enabled this progress; some examples of exciting machine learning applications from the past decade; and how we might create more powerful machine learning systems to truly achieve the goal of building intelligent machines.

Jeff Dean's article was published in the "AI & Society" special issue of Dædalus, the journal of the American Academy of Arts and Sciences.


Article address: https://www.amacad.org/sites/default/files/publication/downloads/Daedalus_Sp22_04_Dean.pdf

The golden decade of deep learning

Advances in artificial intelligence hardware and software

Hardware and software for artificial intelligence: deep learning is built around a relatively small set of linear algebra operations, such as matrix multiplications, vector dot products, and similar computations. This narrowness means we can build dedicated computers or accelerator chips tailored to support exactly these computations. Such specialization enables new efficiencies and design options compared to general-purpose CPUs, which must run a much wider variety of algorithms.
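To make this concrete, here is a minimal NumPy sketch (the sizes and names are illustrative, not taken from the article) showing that a dense layer boils down to a matrix multiplication plus an elementwise nonlinearity, exactly the kind of operation such accelerators target:

```python
# Minimal sketch: a dense neural-network layer reduces to a matrix multiply
# plus an elementwise nonlinearity -- the operations accelerators are built for.
import numpy as np

batch, d_in, d_out = 32, 256, 128          # illustrative sizes
x = np.random.randn(batch, d_in)           # a batch of input vectors
W = np.random.randn(d_in, d_out) * 0.01    # layer weights
b = np.zeros(d_out)                        # layer bias

hidden = np.maximum(x @ W + b, 0.0)        # matmul + bias + ReLU
print(hidden.shape)                        # (32, 128)
```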

Back in the early 2000s, a handful of researchers began exploring the use of GPUs to implement deep learning algorithms. In 2004, computer scientists Kyoung-Su Oh and Keechul Jung demonstrated a nearly 20-fold speedup of a neural network algorithm using a GPU. In 2008, computer scientist Rajat Raina and colleagues showed that for some unsupervised learning algorithms, GPUs could be up to 72.6 times faster than the best CPU-based implementations.

With improvements in computing hardware, deep learning began to make significant progress in image recognition, speech recognition, language understanding, and more. Deep learning algorithms have two properties that make them well suited to specialized hardware: first, they are very tolerant of reduced precision; second, their computation consists of sequences of linear algebra operations on dense matrices or vectors.
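As a rough illustration of the first property (the numbers here are made up for demonstration), running the same layer in reduced precision typically gives results very close to full precision:

```python
# Illustrative sketch of precision tolerance: the same layer computed in
# float16 stays close to the float32 result, which is why accelerators can
# use reduced-precision arithmetic.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 256)).astype(np.float32)
W = (rng.standard_normal((256, 128)) * 0.01).astype(np.float32)

full = np.maximum(x @ W, 0.0)
half = np.maximum(x.astype(np.float16) @ W.astype(np.float16), 0.0).astype(np.float32)

print(np.max(np.abs(full - half)))   # small relative to the activation magnitudes
```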

To make deep learning computations easier to express, researchers have developed open source software frameworks that today help a large number of researchers and engineers advance deep learning research and apply deep learning to a wider range of fields.

Some of the early frameworks included Torch, Theano, DistBelief, and Caffe, followed by TensorFlow, open-sourced by Google in 2015. TensorFlow is a framework for expressing machine learning computations that incorporates ideas from earlier frameworks such as Theano and DistBelief. To date, TensorFlow has been downloaded more than 50 million times and is one of the most popular open source software packages in the world.
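As a hedged illustration of what "expressing a machine learning computation" means in practice, here is a toy TensorFlow snippet (the example itself is hypothetical, not drawn from the article) that defines a small computation and obtains gradients automatically:

```python
# Toy TensorFlow example: define a computation and differentiate it.
import tensorflow as tf

w = tf.Variable([[1.0], [2.0]])          # 2x1 weight matrix
x = tf.constant([[3.0, 4.0]])            # 1x2 input

with tf.GradientTape() as tape:
    y = tf.matmul(x, w)                  # forward computation
    loss = tf.reduce_sum(tf.square(y))   # scalar loss

grad = tape.gradient(loss, w)            # gradient of the loss w.r.t. w
print(grad.numpy())
```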

PyTorch, released in 2016, a year after TensorFlow, became popular with researchers because it makes it easy to express a variety of research ideas in Python. JAX, released in 2018, is a popular Python-oriented open source library that combines sophisticated automatic differentiation with the underlying XLA compiler, which TensorFlow also uses to efficiently map machine learning computations onto a variety of different types of hardware.
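Below is a similarly minimal JAX sketch (again a hypothetical toy example) showing how automatic differentiation and XLA compilation compose:

```python
# Toy JAX example: differentiate a function, then compile it with XLA via jit.
import jax
import jax.numpy as jnp

def loss(w, x):
    return jnp.sum((x @ w) ** 2)         # simple quadratic loss

grad_fn = jax.jit(jax.grad(loss))        # gradient function, compiled with XLA

w = jnp.array([[1.0], [2.0]])
x = jnp.array([[3.0, 4.0]])
print(grad_fn(w, x))                     # gradient of the loss w.r.t. w
```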

The importance of open source machine learning libraries and tools such as TensorFlow and PyTorch cannot be overstated: they allow researchers to quickly try out ideas, and as researchers and engineers around the world build more easily on each other's work, progress across the entire field accelerates.

The research explosion

Advances in research, increasingly capable ML hardware (GPUs, TPUs, etc.), and the widespread adoption of open source machine learning tools (TensorFlow, PyTorch, etc.) have led to a dramatic increase in research in machine learning and its applications. One strong indicator is the number of machine learning papers posted on arXiv, a popular preprint hosting service, which hosted more than 32 times as many machine learning papers in 2018 as it had roughly a decade earlier (a growth rate of more than doubling every two years). It can be said that we live in an exciting era.

Scientific and engineering applications proliferate

The transformative growth of computing power, advances in machine learning hardware and software, and the proliferation of machine learning research have together led to a surge of machine learning applications in science and engineering. Through collaborations with experts in key areas such as climate science and healthcare, machine learning researchers are helping to solve important problems that benefit society and advance human progress. These areas of science and engineering include the following:

Neuroscience

Molecular biology

Health care

Weather, environment, and climate challenges

Robotics

Accessibility

Personalized learning

Computer-aided creativity

Important building blocks

Transformers

ML for computer systems

For details on each of these areas, please refer to the original article.

The future of machine learning

There are some interesting research directions emerging in the ML research community that could be even more interesting if combined.

First, work on sparsely activated models, such as the sparsely-gated mixture-of-experts (MoE), shows how to build very large-capacity models in which only a subset of the model is "activated" for any given example (for example, two or three out of 2,048 experts).
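The following toy sketch (the sizes, names, and gating scheme are illustrative assumptions, not the actual implementation from any MoE paper) shows the basic idea of routing each example to only the top-k of many experts:

```python
# Illustrative sketch of sparse "top-k of N experts" routing: only a few of
# the experts are evaluated for any given example.
import numpy as np

rng = np.random.default_rng(0)
num_experts, d = 2048, 64
experts = [rng.standard_normal((d, d)) * 0.01 for _ in range(num_experts)]
gate_w = rng.standard_normal((d, num_experts)) * 0.01

def moe_forward(x, k=2):
    scores = x @ gate_w                           # gating score for each expert
    top = np.argsort(scores)[-k:]                 # indices of the k highest-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                      # softmax over the selected experts
    # Only k of the 2048 experts are evaluated for this example.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d)
print(moe_forward(x).shape)                       # (64,)
```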

Second, work on automated machine learning (AutoML), where techniques such as neural architecture search (NAS) or evolutionary architecture search (EAS) can automatically learn efficient structures or other aspects of an ML model or component in order to optimize accuracy for a given task. AutoML typically involves running many automated experiments, each of which may involve a huge amount of computation.
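As a purely illustrative sketch of the "many automated experiments" idea (the search space and scoring function here are invented placeholders, not a real NAS system), a random search over architectures might look like this:

```python
# Toy architecture search: sample candidate configurations, evaluate each,
# keep the best. Real AutoML systems use far more sophisticated search.
import random

search_space = {
    "num_layers": [2, 4, 8],
    "hidden_units": [64, 128, 256],
    "activation": ["relu", "gelu"],
}

def evaluate(config):
    # Placeholder for "train the candidate model and measure validation accuracy";
    # a random score stands in for that expensive experiment.
    return random.random()

best_config, best_score = None, -1.0
for _ in range(20):                                   # 20 automated experiments
    config = {k: random.choice(v) for k, v in search_space.items()}
    score = evaluate(config)
    if score > best_score:
        best_config, best_score = config, score

print(best_config, round(best_score, 3))
```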

Finally, multi-task training at a modest scale of a few to a few dozen related tasks, or transfer learning from a model trained on a large amount of data for related tasks and then fine-tuned on a small amount of data for a new task, has proven to be very effective for a wide range of problems.
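Here is a minimal illustration of that transfer-learning pattern (all names and data are hypothetical): reuse a fixed feature extractor trained elsewhere and fit only a small head on a handful of examples for the new task.

```python
# Illustrative transfer learning: keep "pretrained" features fixed and train
# only a small head on few examples for the new task.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_feat = 100, 32

# Stand-in for a feature extractor learned on a large related dataset.
pretrained_W = rng.standard_normal((d_in, d_feat)) * 0.1

def features(x):
    return np.maximum(x @ pretrained_W, 0.0)

# Only 20 labeled examples are available for the new task.
x_new = rng.standard_normal((20, d_in))
y_new = rng.standard_normal(20)

# "Fine-tuning" here is simply fitting the small head by least squares.
head, *_ = np.linalg.lstsq(features(x_new), y_new, rcond=None)
print(head.shape)                       # (32,) -- the only newly trained parameters
```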

A very interesting research direction is to combine these three trends in a system running on large-scale ML accelerator hardware. The goal is to train a single model that can perform thousands or even millions of tasks. Such a model might be composed of many components with different structures, with data flowing between them in a relatively dynamic, example-by-example way. The model might use techniques such as sparsely-gated mixture of experts and learned routing to produce a very large-capacity model, in which any given task or example activates only a small fraction of the total components in the system.

Figure 1 of the original article depicts such a multi-task, sparsely activated machine learning model.


Each component might itself run an AutoML-like architecture search to adapt its structure to the type of data being routed to it. New tasks can take advantage of components trained on other tasks whenever that is useful. Jeff Dean hopes that with very large-scale multi-task learning, shared components, and learned routing, the model can learn to perform new tasks quickly and with high accuracy, even when there are relatively few examples for each new task, because the model can draw on the expertise and internal representations it has already acquired in accomplishing other related tasks.

Building a single machine learning system capable of handling millions of tasks and of learning to accomplish new tasks automatically is a grand challenge in artificial intelligence and computer systems engineering. It will require expertise in many areas, including machine learning algorithms, responsible AI (such as fairness and interpretability), distributed systems, and computer architecture, to advance the field by building a system that can generalize to independently solve new tasks across the full range of machine learning applications.

Responsible AI development

While AI has the potential to help in every aspect of people's daily lives, all researchers and practitioners should ensure that these methods are developed responsibly: scrutinizing bias, fairness, privacy, and other social considerations around how these tools work and affect others, and working to address all of these issues appropriately.

It is also important to develop a clear set of principles to guide responsible AI development. In 2018, Google published a set of AI Principles to guide its AI-related work and uses of AI. The principles lay out important areas to consider, including bias, safety, fairness, accountability, transparency, and privacy in machine learning systems. In recent years, other organizations and governments have followed this model and issued their own guidelines for the use of AI. Jeff Dean hopes this trend will continue until it is no longer a trend but the standard across all machine learning research and development.

Jeff Dean's vision for the future

The 2010s were indeed a golden decade for deep learning research and progress. During this decade, great strides were made on some of the most difficult problems posed at the 1956 Dartmouth workshop on artificial intelligence. Machines can now see, hear, and understand language in the ways early researchers had hoped. Success in these core areas has led to significant advances in many fields of science, has made smartphones smarter, and will open up even more possibilities as people continue to create more sophisticated and powerful deep learning models that help in everyday life. With the help of powerful machine learning systems, people will become more creative and capable in the future.
