
Nobel Prize Moment: They Built a Bridge Between AI and Physics

Over the past 15 to 20 years, there has been an explosion in the development of machine learning, which uses a structure known as artificial neural networks. Nowadays, when we talk about "artificial intelligence", we are usually referring to this type of technology.

Although computers cannot think, machines can now mimic functions such as memory and learning. These capabilities were made possible by the work of John J. Hopfield and Geoffrey E. Hinton, who have been working on artificial neural networks since the 1980s. Using basic concepts and methods from physics, they developed techniques that process information using network structures.

They were awarded the 2024 Nobel Prize in Physics "for foundational discoveries and inventions that enable machine learning with artificial neural networks."


John Hopfield is an American physicist, neuroscientist, and professor at Princeton University. In 1982, he invented the famous "Hopfield neural network", the first neural network model capable of storing multiple patterns and exhibiting memory, an important milestone in the early development of neural networks. The Hopfield network makes use of the physics that describes the atomic spin properties of materials. Its creation paved the way for the development of recurrent neural networks, and the energy-minimization principle it introduced has had a profound influence on solving optimization problems.

Geoffrey Hinton is a British-born Canadian computer scientist and neuroscientist, a professor at the University of Toronto, and the former head of Google Brain. Hinton was one of the early researchers of neural networks, and one who never abandoned them. In the 1980s, he made important contributions to introducing the backpropagation algorithm into the training of multi-layer neural networks, and he invented the "Boltzmann machine", a stochastic recurrent neural network built on the Hopfield network using statistical physics. In 2012, Hinton and his two students, Alex Krizhevsky and Ilya Sutskever, created AlexNet, which won the computer vision competition ImageNet by a landslide, setting a milestone in the development of deep neural networks and inspiring a wave of research using convolutional neural networks (CNNs) and graphics processing units (GPUs) to accelerate deep learning. Hinton, along with Yoshua Bengio and Yann LeCun, is known as one of the "Big Three of Deep Learning" and a "Godfather of AI"; the three shared the 2018 Turing Award.


Mimicking the brain

Techniques such as "machine learning" and "artificial neural networks" that we often hear about today were originally inspired by the structure of the brain. In an artificial neural network, the brain's neurons are represented by nodes with different values. These nodes influence each other through connections analogous to synapses, and these connections can be strengthened or weakened. Such a network can be trained, for example, by strengthening the connections between nodes that have high values at the same time.


Left: The brain's neural network is made up of living cells (neurons) with complex internal mechanisms. They can send signals to each other through synapses. As we learn, the connections between some neurons become stronger, while others become weaker.

Right: Artificial neural networks are built from nodes that encode numeric values. These nodes are connected to each other, and when the network is trained, the connections between the nodes that are activated at the same time become stronger, while the others become weaker.

At first, scientists working on neural networks simply wanted to understand how the brain works. In the 1940s, researchers began to explore the mathematics behind the brain's networks of neurons and synapses. Psychology also provided important clues: the neuroscientist Donald Hebb hypothesized that learning occurs because the connections between neurons are strengthened when they work together.

Scientists then followed this idea and built artificial neural networks in computer simulations to reproduce the function of brain networks. In these simulations, nodes play the role of the brain's neurons, each node is given a value, and synapses are represented by connections between nodes that can be strengthened or weakened. Donald Hebb's hypothesis is still one of the basic rules used to update artificial networks through training.
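As a rough illustration of Hebb's idea that connections between co-active neurons grow stronger, here is a minimal Python sketch of a Hebbian weight update; the learning rate and the ±1 activity encoding are assumptions made for the example, not details from the article.

```python
import numpy as np

def hebbian_update(weights, activity, learning_rate=0.1):
    """Strengthen the connection between every pair of nodes
    that are active at the same time (Hebb's rule)."""
    # The outer product is large where two nodes have high values together.
    weights += learning_rate * np.outer(activity, activity)
    np.fill_diagonal(weights, 0.0)  # no self-connections
    return weights

# Toy example: 4 nodes, activity encoded as +1 (active) / -1 (inactive).
w = np.zeros((4, 4))
pattern = np.array([1, -1, 1, -1], dtype=float)
w = hebbian_update(w, pattern)
print(w)
```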

At the end of the 1960s, some discouraging theoretical results led many researchers to suspect that these neural networks would never be truly useful. In the 1980s, however, several important ideas, including the work of this year's two laureates, rekindled interest in artificial neural networks.


Associative memory

Imagine trying to recall a rather unusual, rarely used word, such as the one for the sloping floor in a movie theater or along an accessible walkway. You search your memory: it's something like a ramp... Maybe it's radial? No, that's not it. Oh, it's rake!

This process of searching through similar words to find the right one resembles the associative memory model Hopfield proposed in 1982 – the Hopfield neural network – which stores patterns and can reproduce them. The Hopfield network makes use of the physics that describes the atomic spin properties of materials. The entire network is described in a way that is equivalent to the energy of a spin system in physics, and it is trained by finding values for the connections between the nodes so that the saved images have low energy. When a distorted or incomplete image is fed into the Hopfield network, it works through the nodes one by one and updates their values so that the network's energy drops. In this way, the network gradually finds the saved image that most closely resembles the imperfect image it was given.
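For readers who want the mathematics, the energy being lowered here is usually written in the standard textbook form below, where the $s_i$ are the node values and the $w_{ij}$ are the connection strengths; this is the conventional formulation, added as an illustration rather than a quotation from Hopfield's paper:

$$E = -\tfrac{1}{2}\sum_{i \neq j} w_{ij}\, s_i\, s_j$$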

Why did Hopfield come up with the idea of using physics to describe "biology"? He was once invited to a conference on neuroscience and encountered research on the structure of the brain. It fascinated him and set him thinking about the dynamics of simple neural networks: when neurons work together, they produce new and powerful properties that are hard to detect if you only look at each individual neuron in the network.

In 1980, Hopfield left his position at Princeton University; his research interests had taken him beyond the fields his fellow physicists worked in. He later accepted a professorship in chemistry and biology at the California Institute of Technology in Pasadena, Southern California. There, he was able to use the school's computer resources to experiment freely and develop his ideas about neural networks.

At the same time, he did not abandon his grounding in physics, but drew inspiration from it to understand how a system made up of many small components working together can generate new and interesting phenomena. He benefited in particular from magnetic materials, whose special properties derive from their atomic spin – a property that makes each atom a tiny magnet. The spins of neighboring atoms affect each other, which allows regions with the same spin direction to form. Using the physics that describes how a material's properties change when spins interact, he constructed a model network with nodes and connections.


The network saves images in an energy landscape

In Hopfield's neural network, the connections between nodes can have different strengths. Each node stores a separate value – in Hopfield's early work, this value could be 0 or 1, like the pixels in a black-and-white photograph.

Hopfield used the energy of a spin system in physics to describe the overall state of this network. The energy is calculated by a formula that uses the values of all the nodes and the strengths of all the connections between them. The Hopfield network is programmed by feeding an image into the nodes, each of which is given a value of black (0) or white (1). The connections of the network are then adjusted using the energy formula so that the stored image has low energy. When another pattern is fed into the network, the program goes through the nodes one by one according to specific rules and checks whether the network's energy would decrease if the value of that node were changed. If changing a black pixel to white reduces the energy, the pixel's color is changed. This process continues until the energy can no longer be reduced. At that point, the network can often reproduce the original image it was trained on.
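The following is a minimal Python sketch of this store-and-recall procedure, using the common ±1 encoding and a Hebbian storage rule; the function names, the greedy update schedule, and the toy 8-pixel patterns are assumptions made for the example, not Hopfield's original code.

```python
import numpy as np

def store(patterns):
    """Set the connection strengths so that each stored pattern is a low-energy state."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:
        w += np.outer(p, p)      # Hebbian storage rule
    np.fill_diagonal(w, 0.0)     # no self-connections
    return w / len(patterns)

def energy(w, s):
    """Spin-system energy of the current node values."""
    return -0.5 * s @ w @ s

def recall(w, s, sweeps=10):
    """Update one node at a time; each update never raises the energy."""
    s = s.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(s)):
            s[i] = 1 if w[i] @ s >= 0 else -1
    return s

# Toy example: store two 8-pixel patterns (+1 = white, -1 = black),
# then recover one of them from a corrupted version.
patterns = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                     [1, -1, 1, -1, 1, -1, 1, -1]])
w = store(patterns)
noisy = np.array([1, 1, -1, 1, -1, -1, -1, -1])   # one pixel flipped
print("energy before:", energy(w, noisy))
restored = recall(w, noisy)
print("energy after: ", energy(w, restored))
print("recovered pattern:", restored)
```

Run on the corrupted input, this sketch settles into the first stored pattern, whose energy is lower than that of the noisy version.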

If you only store one pattern, this might not seem very impressive. You might wonder why you wouldn't simply save the image itself and compare it directly to another image being tested. What makes Hopfield's approach special is that the network can store several images at the same time and can usually tell them apart.

Hopfield likened the process of searching the network for a particular saved state to rolling a small ball across a landscape of hills and valleys, slowed by friction. If the ball is dropped at a particular location, it rolls into the nearest valley and stops there. In the same way, if the pattern fed to the network is close to one that has already been stored, the network keeps moving downhill through the energy landscape until it settles at a low point and thus finds the closest pattern in its memory.


The Hopfield neural network can be used to reproduce data that contains noise or has been partially erased. Image credit: Johan Jarnestad/The Royal Swedish Academy of Sciences

Hopfield and others went on to develop the details of how the Hopfield network functions, including nodes that can store any value, not just 0 or 1. If you think of the nodes as pixels in a picture, they can take on different colors, not just black or white. The improved approach makes it possible to store more pictures and to distinguish between them even when they are very similar. As long as the information is built up from many data points, it can be identified or reconstructed.


"Boltzmann machine"

It's one thing to remember an image, but it takes a little more effort to understand what an image means.

Even young children can confidently tell whether an animal is a dog, a cat, or a squirrel. At first they may make occasional mistakes, but they soon get it right almost every time. Children can learn this even without seeing any diagrams or explanations of concepts such as species or mammals. After encountering a few examples of each animal, a child gradually becomes aware of the different types. By observing and experiencing their surroundings, people learn to recognize a cat, understand a word, or walk into a room and notice that something has changed.

When Hopfield published his paper on associative memory, Geoffrey Hinton was working at Carnegie Mellon University in the United States. Having studied experimental psychology and artificial intelligence in England and Scotland, he wondered whether machines could learn to process patterns and to classify and interpret information on their own, just as humans do. Together with his colleague Terrence Sejnowski, Hinton built a new model by extending the Hopfield network with ideas from statistical physics.

Statistical physics describes systems made up of many similar elements, such as the molecules in a gas. It is very difficult, if not impossible, to trace the behavior of every individual molecule in a gas. But we can treat the molecules as a whole and determine the gas's overall properties, such as pressure or temperature. There are many possible ways for the molecules to spread through a volume at different individual speeds while still producing the same collective properties.

Statistical physics analyzes the various states in which the individual components can jointly exist and calculates the probability of each state occurring. Some states are more likely than others, depending on the amount of available energy, as described in an equation by the 19th-century physicist Ludwig Boltzmann. Hinton's network takes advantage of this equation, and in 1985 he published the network under the striking name "Boltzmann machine".

A Boltzmann machine typically uses two different types of nodes: visible nodes, which receive information, and hidden nodes, which form a hidden layer. The values of the hidden nodes and their connections also contribute to the energy of the whole network.

Such a machine runs by applying a rule that updates the node values one at a time. Eventually the Boltzmann machine reaches a state in which the pattern of the nodes can still change, but the overall properties of the network remain the same. According to Boltzmann's equation, each possible pattern then has a specific probability, determined by the network's energy. When the machine stops running, it has generated a new pattern, which makes the Boltzmann machine an early example of a generative model.
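To make "a probability determined by the network's energy" concrete, the sketch below enumerates every state of a tiny network and assigns it the standard Boltzmann probability, proportional to exp(-E/T); the 3-node size, the weights, and the temperature are made-up values for illustration only.

```python
import itertools
import numpy as np

# Tiny fully connected network: 3 nodes taking values +1 or -1.
w = np.array([[0.0,  1.0, -0.5],
              [1.0,  0.0,  0.3],
              [-0.5, 0.3,  0.0]])
T = 1.0  # "temperature": how strongly low-energy states are favored

def energy(s):
    return -0.5 * s @ w @ s

# Boltzmann distribution: P(state) proportional to exp(-E(state)/T).
states = [np.array(s) for s in itertools.product([-1, 1], repeat=3)]
unnormalized = np.array([np.exp(-energy(s) / T) for s in states])
probs = unnormalized / unnormalized.sum()

for s, p in zip(states, probs):
    print(s, f"P = {p:.3f}")  # low-energy states get the highest probability
```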

The Boltzmann machine can learn – not from instructions, but from examples. It is trained by updating the values of the network's connections so that the example patterns fed into the visible nodes during training have the highest possible probability of occurring when the machine runs. If the same pattern is repeated several times during training, its probability becomes even higher. Training also affects the probability that the machine will output new patterns resembling the training examples.
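For completeness, the classic Boltzmann-machine learning rule from the literature (not quoted in the article) adjusts each connection by comparing how often two nodes are active together on the training data with how often they are active together when the machine runs freely; here $\eta$ is an assumed learning rate:

$$\Delta w_{ij} = \eta \left( \langle s_i s_j \rangle_{\text{data}} - \langle s_i s_j \rangle_{\text{model}} \right)$$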

A trained Boltzmann machine can recognize familiar features in information it has never seen before. Imagine meeting a friend's sibling: you can immediately tell that the two must be related. Similarly, if a Boltzmann machine sees a new example that belongs to one of the categories in its training data, it can recognize it and distinguish it from dissimilar information.

In its original form, the Boltzmann machine was quite inefficient and took a long time to find solutions. Things became more interesting as it was developed in various directions, developments Hinton continued to explore. Later versions were simplified by removing some of the connections between units, and it turned out that this can make the machine more efficient.

In the 1990s, many researchers lost interest in artificial neural networks, but Hinton was one of those who kept working in the field. He also helped kick off the new wave of exciting results: in 2006, together with his colleagues Simon Osindero, Yee Whye Teh, and Ruslan Salakhutdinov, he developed a method for pre-training a network with a series of Boltzmann machines stacked in layers. This pre-training gives the connections in the network a better starting point, which improves its training to recognize elements in images.

Boltzmann machines are often used as part of larger networks, for example to recommend movies or TV series based on viewers' preferences.


Machine Learning: Present and Future

The work that John Hopfield and Geoffrey Hinton began in the 1980s laid the foundation for the machine learning revolution that started around 2010.

The AI boom we are witnessing today is due to the massive amount of data that can be used to train networks, as well as a huge increase in computing power. Today's artificial neural networks are often very large and composed of multiple layers. These are known as deep neural networks, and their training method is known as deep learning.

In his 1982 article on associative memory, Hopfield used a network of 30 nodes. If all the nodes are connected to each other, there are 435 connections. Each node has its own value and each connection its own strength, so in total there are fewer than 500 parameters to keep track of. He also tried a network of 100 nodes, but that was too complicated for the computers of the time. Compare this with today's large language models, such as the one behind ChatGPT, which are built as networks and can contain more than a trillion parameters.
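The 435 figure is simply the number of distinct pairs among 30 fully connected nodes, and adding the 30 node values gives the "fewer than 500" total:

$$\binom{30}{2} = \frac{30 \times 29}{2} = 435, \qquad 435 + 30 = 465 < 500.$$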

Many researchers are now developing applications of machine learning. Which areas will prove most viable remains to be seen, and a wide-ranging ethical discussion has grown up around the development and use of this technology.

Physics has provided tools for the development of machine learning, and machine learning has in turn long been used in physics research – for example, to sift through and process the vast amounts of data needed to discover the Higgs particle, to reduce noise in measurements of gravitational waves from colliding black holes, and to search for exoplanets.
