
Physical systems perform machine-learning computations: deep physical neural networks trained with backpropagation

Editor | Radish Peel

Deep learning models have become a universal tool in science and engineering. However, their energy requirements increasingly limit their scalability. Deep learning accelerators aim to perform deep learning efficiently, usually targeting the inference phase and often by exploiting physical substrates beyond conventional electronics. Methods to date, however, have been unable to apply the backpropagation algorithm to train unconventional novel hardware in situ. Backpropagation's advantages have made it the de facto method for training large-scale neural networks, so this shortcoming constitutes a major obstacle.

Here, researchers at Cornell University introduce a hybrid in situ-in silico algorithm called physics-aware training (PAT), which applies backpropagation to train controllable physical systems.

Just as deep learning performs computation with deep neural networks built from layers of mathematical functions, this method lets researchers train deep physical neural networks built from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism with conventional artificial-neural-network layers.

To demonstrate the universality of the method, the researchers trained physical neural networks based on optics, mechanics, and electronics to experimentally perform audio and image classification tasks. Physics-aware training combines the scalability of backpropagation with the automatic mitigation of imperfections and noise that in situ algorithms provide.

Physical neural networks have the potential to perform machine learning faster and more energy-efficiently than conventional electronic processors and, more broadly, can endow physical systems with automatically designed physical functionalities, for example for robotics, materials, and smart sensors.

The study, titled "Deep physical neural networks trained with backpropagation," was published in Nature on January 26, 2022.


As with many historical developments in AI, the widespread adoption of deep neural networks (DNNs) was enabled in part by hardware that evolved in tandem with the algorithms.

In 2012, building on earlier work, Krizhevsky's team showed that the backpropagation algorithm could be executed efficiently on graphics processing units to train large DNNs for image classification. Since 2012, the computational demands of DNN models have grown rapidly, outpacing Moore's law.

Today, DNNs are increasingly limited by the energy efficiency of the underlying hardware. This emerging energy problem has inspired special-purpose hardware: DNN "accelerators", mostly based on direct mathematical isomorphism between the hardware's physics and the mathematical operations in DNNs. Some accelerator proposals use physical systems beyond conventional digital electronics, such as optics and analog electronic crossbar arrays. Most devices target the inference phase of deep learning, which accounts for up to 90% of the energy cost of deep learning in commercial deployments, although a growing number also address the training phase.
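To make the notion of an operation-by-operation isomorphism concrete, here is a minimal, hypothetical sketch (not taken from the paper) of how an analog crossbar array physically realizes a matrix-vector product: Ohm's law and Kirchhoff current summation compute i = Gv, so the programmed conductances play the role of the weight matrix. The values of `G` and `v` below are arbitrary placeholders.

```python
import numpy as np

# Illustrative only: a resistive crossbar "computes" a matrix-vector product by physics.
G = 1e-3 * np.random.rand(4, 8)   # programmed conductances (siemens) act as the weights W
v = np.random.rand(8)             # applied input voltages encode the activation vector x
i = G @ v                         # output currents: i_j = sum_k G_jk * v_k, i.e. y = W x
print(i)
```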


Illustration: Introduction to PNNs. (Source: the paper)

However, realizing trained mathematical transformations by engineering hardware into strict, operation-by-operation mathematical isomorphism is not the only way to perform efficient machine learning. Instead, the physical transformations of the hardware can be trained directly to perform the desired computations. The researchers call this approach a physical neural network (PNN) to emphasize that it is physical processes, not mathematical operations, that are being trained.

This distinction is not merely semantic: by breaking the conventional software-hardware divide, PNNs offer the possibility of building neural-network hardware from almost any controllable physical system. As anyone who has simulated the evolution of a complex physical system knows, physical transformations are often faster and consume less energy than their digital simulations.

This suggests that PNNs which exploit these physical transformations most directly could perform certain computations far more efficiently than conventional paradigms, providing a path to machine learning that is more scalable, more energy-efficient, and faster.


Illustration: Example PNN implemented with a broadband optical SHG experiment. (Source: the paper)

PNNs are particularly well suited to DNN-like calculations, much more so than to digital logic or even other forms of analog computation. As one might expect from their robust processing of natural data, DNNs and physical processes share many structural features, such as hierarchy, approximate symmetries, noise, redundancy, and nonlinearity.

As physical systems evolve, the transformations they perform are effectively equivalent to approximations, variants, or combinations of the mathematical operations commonly used in DNNs, such as convolutions, nonlinearities, and matrix-vector multiplications. Thus, by using sequences of controlled physical transformations, researchers can realize trainable, hierarchical physical computations: deep PNNs.
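As a rough, purely illustrative picture of such a hierarchy (not the paper's systems), the toy sketch below stacks several generic parameterized "physical" transformations; the function `physical_layer`, its `coupling` parameters, the saturating response, and the injected noise are all hypothetical stand-ins for a real controllable apparatus.

```python
import numpy as np

rng = np.random.default_rng(0)

def physical_layer(x, params):
    """Stand-in for one controllable physical transformation: a fixed, generally
    nonlinear input-output map that depends on controllable parameters."""
    y = np.tanh(params["coupling"] @ x)          # hypothetical nonlinear physical response
    y += 0.01 * rng.standard_normal(y.shape)     # real hardware adds noise and imperfections
    return y

# A deep PNN: the output of one physical system feeds the next, giving a
# trainable hierarchy analogous to the stacked layers of a DNN.
layers = [{"coupling": rng.standard_normal((16, 16))} for _ in range(3)]
x = rng.standard_normal(16)
for p in layers:
    x = physical_layer(x, p)
print(x.shape)
```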

While the paradigm of building computers by directly training physical transformations has roots in evolved computational materials, it is emerging today in a variety of fields, including optics, spintronic nano-oscillators, nanoelectronic devices, and small-scale quantum computers.

A closely related trend is physical reservoir computing (PRC), in which the transformations of an untrained physical "reservoir" are linearly combined by a trainable output layer. Although PRC exploits generic physical processes for computation, it cannot realize the DNN-like hierarchical computations described above.
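A minimal sketch of the PRC idea under toy assumptions: the reservoir transformation (here a random matrix followed by a tanh response, purely illustrative) is fixed and never trained, and only the linear readout is fitted, here in closed form by ridge regression.

```python
import numpy as np

rng = np.random.default_rng(1)

W_res = rng.standard_normal((64, 8))       # fixed, untrained "reservoir" (stands in for the physics)
X = rng.standard_normal((200, 8))          # toy inputs
Y = rng.standard_normal((200, 3))          # toy targets

H = np.tanh(X @ W_res.T)                   # reservoir features: the physics is used as-is

# Only the linear readout is trained (ridge regression) -- this is the PRC recipe.
W_out = np.linalg.solve(H.T @ H + 1e-3 * np.eye(64), H.T @ Y)
pred = H @ W_out                           # (200, 3) predictions
```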

In contrast, training the physical transformations themselves can, in principle, overcome this limitation. To train physical transformations experimentally, researchers have usually relied on gradient-free learning algorithms. Gradient-based learning algorithms, such as backpropagation, are considered essential for the efficient training and good generalization of large-scale DNNs.

There have therefore been proposals to implement gradient-based training in physical hardware. However, these inspiring proposals make assumptions that exclude many physical systems, such as linear or dissipation-free evolution, or dynamics that are well described by gradient flows. The most common way around these limitations is in silico training, that is, learning entirely within numerical simulations. But despite the generality this affords, simulations of nonlinear physical systems are rarely accurate enough for models trained in silico to transfer faithfully to real devices.


Figure: Physics-aware training. (Source: the paper)

Here, the Cornell team demonstrates a general framework for using backpropagation to directly train arbitrary physical systems to perform DNN-like computations, which they call PNNs. Their approach is implemented through a hybrid in situ-in silico algorithm called physics-aware training (PAT). PAT allows researchers to execute the backpropagation algorithm efficiently and accurately on any sequence of physical input-output transformations.
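The following is a minimal sketch of that idea, with the physical system mocked by a noisy, non-differentiable black-box function: the forward pass and the loss use the (mock) hardware's outputs, while the backward pass pushes the resulting error through an approximate, differentiable digital model to obtain gradients for the controllable parameters. The functions `physical_system` and `digital_model` and all constants here are illustrative assumptions, not the paper's implementation.

```python
import torch

def physical_system(x, theta):
    """Mock hardware forward pass: noisy, not differentiable, and not perfectly
    captured by the digital model below (in the paper this would be an optical,
    mechanical or electronic experiment)."""
    with torch.no_grad():
        z = x @ theta
        return torch.tanh(z) + 0.02 * z**2 + 0.05 * torch.randn_like(z)

def digital_model(x, theta):
    """Approximate, differentiable digital model, used only for the backward pass."""
    return torch.tanh(x @ theta)

theta = torch.randn(8, 3, requires_grad=True)         # controllable physical parameters
opt = torch.optim.SGD([theta], lr=0.05)
x, target = torch.randn(64, 8), torch.randn(64, 3)

for step in range(200):
    opt.zero_grad()
    y_phys = physical_system(x, theta)                # 1. forward pass on the real system
    err = y_phys - target                             # 2. error computed from physical outputs
    y_model = digital_model(x, theta)                 # 3. backward pass through the digital model:
    y_model.backward(gradient=2 * err / err.numel())  #    approximate gradient w.r.t. theta
    opt.step()                                        # 4. update the physical parameters
```

Because the error is evaluated on the true physical outputs rather than on the model's predictions, training of this kind automatically compensates for the hardware's noise and model mismatch, which is the property the text attributes to PAT.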

They demonstrated the universality of this approach by experimentally performing image classification with three different systems: the multimode mechanical oscillations of a driven metal plate, the analog dynamics of a nonlinear electronic oscillator, and ultrafast optical second-harmonic generation (SHG).

The researchers obtained accurate hierarchical classifiers that exploited each system's unique physical transformations while automatically mitigating each system's unique noise processes and imperfections.

Although PNNs are very different from conventional hardware, they integrate readily into modern machine learning. Experiments show that PNNs can be seamlessly combined with conventional hardware and neural-network methods in hybrid physical-digital architectures, in which conventional hardware learns, via PAT, to cooperate with unconventional physical resources.
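That hybrid arrangement can be sketched in the same toy style as the PAT example above: a conventional, exactly differentiable digital layer feeds a mock physical layer trained with PAT, and a single backward pass through the digital model updates both together. Every function and constant here is again a hypothetical stand-in rather than the paper's architecture.

```python
import torch

# Mock physical layer (non-differentiable, noisy) and its differentiable digital model.
phys = lambda h, th: (torch.tanh(h @ th) + 0.05 * torch.randn(h.shape[0], th.shape[1])).detach()
model = lambda h, th: torch.tanh(h @ th)

front = torch.nn.Linear(8, 8)                      # conventional digital layer: exact gradients
theta = torch.randn(8, 3, requires_grad=True)      # physical layer: PAT (approximate) gradients
opt = torch.optim.SGD(list(front.parameters()) + [theta], lr=0.05)

x, target = torch.randn(64, 8), torch.randn(64, 3)
for _ in range(200):
    opt.zero_grad()
    h = front(x)
    y_phys = phys(h.detach(), theta)               # forward pass through the (mock) hardware
    y_model = model(h, theta)                      # same step through the digital model
    err = y_phys - target
    # Backpropagating the physically measured error through the digital model updates
    # the unconventional physical parameters and the conventional layer jointly.
    y_model.backward(gradient=2 * err / err.numel())
    opt.step()
```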

Ultimately, PNNs offer a route to improving the energy efficiency and speed of machine learning by many orders of magnitude, as well as a path to automatically designing complex functional devices such as functional nanoparticles, robots, and smart sensors.

Discussion

The results show that it is feasible to train controlled physical systems to perform DNN-like calculations. In principle, many systems not normally used for computing appear capable of performing parts of machine-learning inference several orders of magnitude faster and more energy-efficiently than conventional hardware.

However, two considerations should be kept in mind. First, owing to underlying symmetries and other constraints, some systems may be well suited only to accelerating restricted classes of computations that share those constraints. Second, because PAT relies on a digital model, PNNs trained with PAT can offer significant advantages only during inference. In a hybrid network, the researchers therefore expect such PNNs to serve as a resource for conventional general-purpose hardware rather than as a complete replacement for it.


Illustration: Image classification with different physical systems. (Source: the paper)

Techniques for in situ training of hardware, together with reliable in silico training methods, compensate for these weaknesses. Devices trained with in situ learning algorithms would perform learning entirely in hardware, potentially enabling faster and more energy-efficient learning than current methods.

Such devices are well suited to settings that require frequent retraining. However, because they must perform both learning and inference, they have more specific hardware requirements than inference-only hardware, which may limit the inference performance they can achieve. In silico training, by contrast, can adjust many physical parameters of a device, including parameters that are fixed permanently during fabrication.

Because the resulting hardware does not itself perform learning, it can be optimized purely for inference. Although accurate, large-scale in silico training has been demonstrated, so far this has been achieved only with analog electronics, for which precise simulations and well-controlled fabrication processes are available.

PAT is suited to settings where the gap between simulation and reality cannot be avoided, for example when hardware must be designed within manufacturing tolerances, operated outside its usual regime, or built on platforms other than conventional electronics.

Improvements to PAT could extend the utility of PNNs. For example, the backward pass of PAT could be replaced by a neural network that directly predicts updates to the physical system's parameters. Implementing this "teacher" neural network with PNNs would allow subsequent training to proceed without digital assistance.

So far, the focus of this work has been on the potential of PNNs as machine-learning accelerators, but PNNs are also promising for other applications, especially those that process or generate physical rather than digital data.

PNNs can perform computations on data within its physical domain, allowing smart sensors to preprocess information before it is converted into the electronic domain (for example, a low-power circuit coupled to a microphone and tuned to recognize specific hot words).

Since the sensitivity, resolution, and energy efficiency achievable by many sensors are limited by the conversion of information into the digital electronic domain and by its subsequent digital processing, PNN-based sensors should have an advantage. More broadly, PAT makes it straightforward to train complex functionality into a physical system. Machine learning and sensing are important applications, but they are only two of the many to which the PAT and PNN concepts can be applied.

Related: https://techxplore.com/news/2022-01-physical-machine-learning.html


"ScienceAI" focuses on the intersection and integration of artificial intelligence with other cutting-edge technologies and basic sciences.

