Is the endgame of artificial neural networks a single neuron?

- Perhaps it really is possible.
Today's state-of-the-art AI systems mimic the human brain by building multilayer neural networks that cram as many neurons as possible into as little space as possible.
Unfortunately, such designs demand enormous amounts of energy and other resources, and their output is still dwarfed by the powerful and remarkably energy-efficient human brain.
Recently, a team at the Technical University of Berlin proposed a new idea: fold a deep neural network of any size into a single neuron with multiple delayed feedback loops.
The findings were published in Nature Communications, a Nature sub-journal.
The concept of a single-neuron "AI brain" points to a new direction for the development of neural networks.
Let's take a look at what kind of research this is!
Specific methods
The research team devised a folded-in-time deep neural network (Fit-DNN): a way to fold a multilayer feed-forward deep neural network into the time domain.
Fit-DNN was inspired by an earlier "folded-in-time" approach in which a single delay-loop configuration and time multiplexing of the input data emulate a ring topology.
Traditional deep neural networks consist of multiple layers of neurons coupled by feed-forward connections.
To implement the functionality of a traditional DNN with a single neuron, the logical order of the layers must be preserved while the operations within each layer are arranged into a sequence.
This can only be achieved by separating in time what previously happened simultaneously: the single neuron receives the right input at the right moment and thereby emulates the individual neurons of each layer in turn.
The traditional connections between adjacent layers become connections between the neuron's states at different times; in other words, the inter-layer connections turn into delayed connections.
The same neuron is weighted differently at different times, with the weights determined by the backpropagation algorithm.
It is a bit like a single guest simulating a conversation around a large dinner table by rapidly switching seats and speaking each part in turn.
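To make this serialization idea concrete, here is a minimal, purely illustrative Python sketch (not the authors' code; the function name, weights, and activation are placeholders): a single activation unit processes the nodes of each layer one after another, and the connections to the previous layer are realized as reads of values produced at earlier time steps, i.e., delayed connections.

```python
import numpy as np

def folded_forward_pass(u, weights, f=np.tanh):
    """Illustrative only: emulate a feed-forward pass with ONE activation unit.

    u       : input vector feeding the first hidden layer
    weights : list of (N_next, N_prev) matrices, one per layer transition
    f       : activation function of the single neuron
    """
    prev_layer = list(u)  # the "previous layer" is simply values from earlier time steps

    for W in weights:
        current_layer = []
        for n in range(W.shape[0]):
            # The single unit fires once per node: its input is a weighted sum of
            # values it produced earlier in time (the delayed connections).
            a = sum(W[n, j] * prev_layer[j] for j in range(W.shape[1]))
            current_layer.append(f(a))
        prev_layer = current_layer  # the next layer reads these as delayed values

    return np.array(prev_layer)
```

Calling `folded_forward_pass(u, [W1, W2])` gives the same numbers as an ordinary two-layer network `f(W2 @ f(W1 @ u))` (without biases), only computed strictly one node at a time; in Fit-DNN this serialization happens physically, in the continuous-time dynamics of the delay system.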
The core of Fit-DNN is a single neuron with multiple delayed and modulated feedback loops; the figure above shows its structure:
The black circle marked with the letter f represents the neuron, and its signal at time t is x(t); this signal is driven by the sum of the data signal J(t), the bias b(t), and the feedback signals.
The adjustable elements are drawn as boxes: the data signal is generated from the input vector u, with the input weights contained in the matrix in the blue box; the bias coefficients produce the bias signal in the gray box; and each feedback loop applies a delay and a temporal modulation to produce one feedback signal.
Finally, the output is obtained from the signal x(t) using the output weight matrix.
Note that appropriate pre-processing and post-processing are required to obtain the data signal J(t) and the output, respectively.
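Written out, the description above corresponds to a delay-differential system of roughly the following form. This is a schematic rendering of the text, not the paper's exact notation: the decay term, the modulation symbols M_d(t), and the delays τ_d are assumptions here.

```latex
% x(t)   : state of the single neuron
% J(t)   : data signal generated from the input vector u via the input weights
% b(t)   : bias signal
% M_d(t) : temporal modulation of the d-th feedback loop, tau_d its delay
\begin{aligned}
  \dot{x}(t) &= -x(t) + f\bigl(a(t)\bigr),\\
  a(t)       &= J(t) + b(t) + \sum_{d=1}^{D} \mathcal{M}_d(t)\, x(t-\tau_d),
\end{aligned}
```

and the output is then read off from x(t) with the output weight matrix, as described above.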
Equivalence with traditional multilayer neural networks
Can a single-neuron Fit-DNN really be functionally equivalent to a multilayer neural network?
As the figure below shows, Fit-DNN converts the dynamics of a single neuron with multiple delay loops into a DNN.
Figure a shows that the temporal evolution of the signal x(t) can be divided into intervals of length T, each of which emulates one hidden layer; the black dots on the solid line represent nodes, and θ is the separation between nodes.
Figure b shows the original time trace cut into intervals of length T, with the nodes labeled according to their positions in the network.
Figure c is obtained by rotating figure b and adding an input layer and an output layer.
The connections are determined by the dynamical dependencies between nodes, which can be computed exactly from the value of θ.
When the node separation θ is large, the familiar shape of a multilayer DNN emerges between the network nodes.
When θ is small, however, the state of each node also depends on the preceding node rather than being fully independent; these additional "inertial" connections are shown as black arrows in figure c.
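The clock arithmetic behind figures a-c is simple. The sketch below (the indexing convention is an assumption for illustration, not taken from the paper) assigns each hidden node a point in time: each hidden layer occupies one interval of length T = N·θ, and nodes within a layer are separated by θ.

```python
def node_time(layer, node, N, theta):
    """Time at which node `node` of hidden layer `layer` is emulated.

    Each hidden layer occupies one interval of length T = N * theta,
    and nodes within a layer are spaced theta apart (1-based indices).
    """
    T = N * theta
    return (layer - 1) * T + node * theta

# Example: the 3rd node of the 2nd hidden layer with N = 100 nodes per layer
t = node_time(layer=2, node=3, N=100, theta=0.5)   # -> 51.5
```

With a small θ the neuron has not fully relaxed between two consecutive node times, so each node's state retains a dependence on its immediate predecessor, which is exactly where the "inertial" connections in figure c come from.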
Recovering a fully connected DNN requires D = 2N − 1 delay loops, but simulation tests showed that this is not strictly necessary.
In fact, sufficient performance can be achieved with fewer delay loops; in that case, Fit-DNN implements a special class of sparse DNNs.
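The count D = 2N − 1 has a simple matrix-geometry reading: an N × N inter-layer weight matrix has exactly 2N − 1 diagonals, and each delay loop can realize roughly one of them (this mapping is our reading of the text, consistent with the later remark about using only half of the available diagonals). A quick, purely illustrative check:

```python
import numpy as np

N = 4                                  # nodes per hidden layer (toy value)
W = np.arange(N * N).reshape(N, N)     # a full N x N inter-layer weight matrix
offsets = range(-(N - 1), N)           # all diagonal offsets of an N x N matrix
diagonals = [np.diag(W, k) for k in offsets]

print(len(diagonals))                  # 7 == 2 * N - 1 diagonals -> D delay loops
```

Dropping some of the delay loops therefore amounts to zeroing out the corresponding diagonals, which is what makes the resulting DNN sparse.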
Under suitable conditions, then, Fit-DNN can fully recover a standard DNN without convolutional layers, and in that case its performance is the same as that of a multilayer DNN.
The single-neuron Fit-DNN folds the topological complexity of a feed-forward multilayer neural network into the time domain through its delay-loop structure.
Such a delay system inherently has an infinite-dimensional phase space, so a single neuron with feedback is enough to fold an entire network into it.
Testing Fit-DNN on computer vision tasks
The researchers first used Fit-DNN for image denoising, i.e., reconstructing original images from noisy versions.
They added Gaussian noise with intensity 1 to images from the Fashion-MNIST dataset, treating each image as a vector with values between 0 (white) and 1 (black).
The resulting vector entries were then clipped at 0 and 1 to obtain noisy grayscale images.
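The preprocessing just described can be reproduced with a few lines of NumPy; this is an illustrative sketch (function name and seed are placeholders), assuming "intensity 1" means unit-variance Gaussian noise.

```python
import numpy as np

def make_noisy(images, noise_std=1.0, seed=0):
    """Add Gaussian noise to images with entries in [0, 1] and clip back.

    images : array of shape (num_images, 28, 28) with values in [0, 1]
             (0 = white, 1 = black, as in the convention described above)
    """
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, noise_std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)    # truncate the entries at 0 and 1
```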
In the figure above, row a contains original images from the Fashion-MNIST dataset; row b shows the same images with added Gaussian noise, which serve as the input data for the training system; and row c shows the reconstructed images.
As can be seen, Fit-DNN recovers the original images quite well.
But the real question for Fit-DNN is whether a single neuron looping through time can match the output of billions of neurons.
To demonstrate the computational power of Fit-DNN's time-folded states, the researchers chose five image classification tasks: MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN.
The experiments compared Fit-DNN's performance on these tasks with different numbers of nodes per hidden layer (N = 50, 100, 200, 400).
The results showed that for the relatively simple MNIST and Fashion-MNIST tasks, the single neuron achieved high accuracy.
For the more challenging CIFAR-10, CIFAR-100, and SVHN tasks, however, its accuracy was lower.
It is worth noting that the Fit-DNN here uses only half of the available diagonals of the weight matrix; increasing the number of nodes N should further improve performance.
Research team
Interested readers can follow the links at the end of this article to dig deeper.
Ingo Fischer is one of the paper's co-authors. He received his PhD in semiconductor physics from Philipps University of Marburg and has since worked as a postdoctoral researcher, assistant professor, and full professor in engineering and physics at universities across Europe.
Classic multilayer neural networks such as the globally popular GPT-3 now have 175 billion parameters, more than 100 times as many as its predecessor, GPT-2.
It is estimated that a single GPT-3 training run consumes as much electricity as 126 Danish households use in a year, with a carbon footprint roughly equivalent to driving to the moon and back.
The Berlin researchers believe Fit-DNN can counter the rising energy cost of training ever more powerful neural networks.
They believe that, as the technology matures, the system could be scaled up to create a virtually unlimited number of neuronal connections from a single neuron suspended in time.
Paper link: https://www.nature.com/articles/s41467-021-25427-4.pdf
Reference link: https://thenextweb.com/news/how-ai-brain-with-only-one-neuron-could-surpass-humans