As is well known, research in artificial intelligence (AI) falls roughly into three schools: symbolism, connectionism, and behaviorism.
Simply put, symbolism seeks to build AI systems on axioms and formal logic; connectionism advocates imitating the connection mechanisms of biological neurons to achieve artificial intelligence; and behaviorism holds that intelligence depends on perception and action, with feedback from the environment driving its improvement.
Different research perspectives give rise to different academic schools. Yet viewed from a higher level of abstraction, these seemingly distinct isms share a common feature: they all study the relationships between things.
Since 2012, connectionism, represented by deep learning, has flourished and been widely applied in computer vision, natural language processing, and other fields. As mentioned earlier, deep learning works well in many areas, but because it is a typical "end-to-end" black-box model, it has an Achilles' heel: it offers no rational explanation for its predictions. In the pursuit of truth, people cannot wander for long in the irrational prosperity of "knowing what works without knowing why". After all, the pursuit of cause and effect may be an important solace that keeps the human mind at ease.
At present, the technical dividend of deep learning is nearly exhausted; the feast has passed, and the field needs to incorporate new theories to break through the ceiling of AI. Many researchers on the front line of scientific research are keenly aware of this. In his keynote speech at the 2019 NeurIPS (Conference on Neural Information Processing Systems), Turing Award winner Yoshua Bengio borrowed the concepts of "System 1" (fast, intuitive, unconscious) and "System 2" (slow, logical, conscious) proposed by the psychologist Daniel Kahneman, and argued that the future of deep learning lies in moving toward System 2, as shown in the figure below.
Artificial intelligence represented by deep learning has reached or surpassed human performance in perceptual tasks such as listening, speaking, and seeing, but it is still in its infancy in cognitive tasks that require external knowledge, logical reasoning, or domain transfer. In other words, deep learning research should gradually evolve from perceptual intelligence toward cognition-based logical reasoning.
Logical reasoning clearly falls within the scope of symbolism. Symbolism, long in the doldrums, seems to be stirring with the sentiment of "if winter comes, can spring be far behind?" But how should logical reasoning be expressed? Besides traditional symbols, graphs are also a promising form of expression.
Coincidentally, in 2020 Academician Zhang Bo of Tsinghua University published an article in the 9th issue of Science China proposing that we should "move toward the third generation of artificial intelligence". He believes that the development path of third-generation AI is to combine the knowledge-driven first generation (symbolic intelligence) with the data-driven second generation (perceptual intelligence), constructing a more powerful third generation (knowledge intelligence) through the coordinated use of knowledge, data, algorithms, and computing power.
Professor Tang Jie of Tsinghua University laid out an exploration path toward cognitive intelligence in his keynote report "The Next Decade of Artificial Intelligence". Naturally, such cognitive maps need to be expressed as graphs.
In September 2020, at the 16th National Annual Conference on High Performance Computing held in Zhengzhou, China, Academician Chen Zuoning gave a keynote report entitled "Analysis of Computing Power Requirements for the Progress of Artificial Intelligence". In the report she, too, argued that the three major schools of artificial intelligence are increasingly integrating and developing collaboratively. She concluded: "The core feature of AI is the study of relationships." Relationships take three forms: first, connection relationships, such as the connections between neurons in a neural network, gradient propagation in the backpropagation algorithm, and variation in evolutionary algorithms; second, logical relationships, such as the recurrent connections in RNNs (recurrent neural networks) and the reasoning relationships in knowledge graphs; third, causal relationships, such as Bayesian networks, decision trees, and the control connections in reinforcement learning.
There is a Chinese saying: "wherever there are people, there is a jianghu" (a web of relationships). Similarly, where there is a relationship, there is a graph. Everything is connected. If one of the core characteristics of AI is the study of "relationships", what would be the preferred tool for describing them? The graph, of course! As a general data structure, graphs describe the relationships between entities very well.
Although the technical dividend of deep learning is thinning, it will not stop abruptly. According to Kevin Kelly, a philosopher of science and technology, technology is another kind of life with its own "technium": to stay vital, it constantly evolves, achieving a survival of the fittest among technologies themselves. The same holds for deep learning as a cutting-edge AI technology.
The combination of neural network technology and graph theory has produced the Graph Neural Network (GNN), a natural outgrowth of this technological evolution. We can regard GNN as the extension of graphs into the field of deep learning, or equivalently, as deep learning's pioneering expedition into the territory of graph data. Either way, this gives a rough outline of what a graph neural network is.
The era of graph neural networks is coming
"Graph" data is the object processed by the graph neural network, in the Chinese context, "graph" and "image" feel similar, but in fact, they are not, let's first sort out the difference between the two.
01
A graph is very different from an image
In English, "image" (Image) and "graph" (Graph) are distinct words that are easy to tell apart, but in the Chinese context the two are often confused. The visual difference between an image and a graph is shown in the figure below. As shown in figure (a), an image is grid-based (Grid) data: it is laid out on a lattice of points, and its expression relies on pixels (Pixel). A graph is different: it consists of a number of nodes (Nodes) and the edges (Edges) connecting them, and it is used to express the relationships between different entities. Data that describes such entity relationships is graph data.
(a) An image composed of pixels (b) A graph composed of nodes and edges
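The grid-versus-graph distinction above can be sketched in a few lines of Python (a hypothetical minimal example; the node names are made up):

```python
import numpy as np

# An image is grid data: a dense array indexed by (row, col) pixel positions.
image = np.zeros((4, 4))          # a 4x4 grayscale image, one value per pixel
image[1, 2] = 255                 # pixels are addressed by their grid position

# A graph is relational data: nodes plus the edges that connect them.
# Here we use a simple adjacency list (node -> set of neighbors).
graph = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"C"},
}

# Unlike the image, a node has no grid "position"; only its connections
# (edges) carry information, e.g. how many neighbors each node has.
degree = {node: len(neighbors) for node, neighbors in graph.items()}
```

The image is fully described by positions and pixel values; the graph is fully described by its node and edge sets.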
As a data structure that efficiently describes relationships between entities, graphs play an increasingly important role in data analysis, and many computational problems involving relationships can be recast as graph-oriented computations. Graphs are widely used, for example, in social network analysis, recommendation network analysis, disease-transmission studies, gene expression network analysis, and cell similarity analysis.
For example, a molecule can be viewed as a graph. All the particles in a molecule interact with one another, but when a pair of atoms maintains a stable distance from each other, we say they share a covalent bond; different atoms and bonds have different characteristic distances. The 3D topology within the molecule can thus be described as a graph in which the nodes are atoms and the edges are covalent bonds. The 3D representation of a molecule and its graph representation are shown in the figure below.
3D representation of molecules Graphical representation of molecules
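As a concrete sketch of the atoms-as-nodes, bonds-as-edges idea, the water molecule (H2O, two O-H covalent bonds) can be encoded as a tiny graph (a hypothetical minimal example):

```python
# Atoms are nodes, covalent bonds are edges.
atoms = {0: "O", 1: "H", 2: "H"}   # node id -> element symbol
bonds = [(0, 1), (0, 2)]           # each O-H covalent bond is an edge

# Build an adjacency list from the edge list (the graph is undirected).
adjacency = {i: [] for i in atoms}
for a, b in bonds:
    adjacency[a].append(b)
    adjacency[b].append(a)

# The oxygen atom (node 0) is bonded to both hydrogen atoms.
```

The same pattern scales to any molecule: list the atoms, list the bonds, and the graph structure follows.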
Social network relationships can likewise be represented as a graph. Social networks are tools for studying the collective behavior patterns of people, institutions, and organizations; we can build a graph of a population by modeling individuals as nodes and their relationships as edges.
With the development of the mobile Internet, the Internet of Things, and social networks, many emerging applications are generating and accumulating graph data at an unprecedented scale and speed (see the figure below). How to analyze and exploit these data has become both an opportunity and a challenge in many fields.
02
The essence of graph neural networks
Next, let us examine the nature of graph neural networks at a macro level. A graph neural network is a particular form of machine learning and a natural extension of neural networks to graph data. Formally, machine learning can be viewed as using statistical or inferential methods to find a function that maps a specific input to a desired output. Conventionally, we write the input (feature) space as an uppercase X and the output space as an uppercase Y, so that machine learning amounts to finding a useful mapping f: X → Y.
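The "find a useful mapping f: X → Y" view can be made concrete with a toy example (hypothetical, using least squares as the "learning" step to recover y = 2x + 1 from samples):

```python
import numpy as np

# Generate training pairs (X, Y) from the true underlying function f(x) = 2x + 1.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 1))
Y = 2.0 * X[:, 0] + 1.0

# "Learning" here means searching for the mapping: fit f(x) = w*x + b
# by least squares over the design matrix [x, 1].
A = np.hstack([X, np.ones((100, 1))])
(w, b), *_ = np.linalg.lstsq(A, Y, rcond=None)

# The recovered (w, b) approximate the true mapping (2, 1).
```

A graph neural network follows the same template; only the input space X becomes graph-structured data.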
Graph neural network learning is no exception. In essence, its task is also to construct a function mapping: for a specific piece of graph data X, after preprocessing and transformation, it applies learned rules to produce an output Y (such as a class label or a regression value). The essence of graph neural networks is shown in the figure below. The question is: how do we find such a mapping? Hence the many graph neural network algorithms that have sprung up, each with its own strengths.
A graph neural network is a technique that combines graph data processing with deep learning. It first uses a graph to express "intricate" relationships; each node then aggregates information from other nodes in some way; and the aggregated data is "deeply processed" by the network for tasks such as classification, regression, or clustering.
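The "aggregate neighbor information, then deeply process" step can be sketched framework-free (an assumed minimal layer using mean aggregation with self-loops, a learned linear map, and ReLU; the tiny path graph is made up for illustration):

```python
import numpy as np

def gnn_layer(adj, features, weight):
    """One round of neighborhood aggregation.

    adj      : (N, N) adjacency matrix (1.0 where an edge exists)
    features : (N, F) node feature matrix
    weight   : (F, F') learnable weight matrix
    """
    adj_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = adj_hat.sum(axis=1, keepdims=True)    # neighborhood sizes
    aggregated = (adj_hat @ features) / deg     # mean over each neighborhood
    return np.maximum(aggregated @ weight, 0.0) # linear map + ReLU

# Tiny example: a 3-node path graph A - B - C with scalar node features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.array([[1.0], [2.0], [3.0]])
out = gnn_layer(adj, features, weight=np.array([[1.0]]))
```

After one layer, each node's feature is the mean of its own and its neighbors' features, so information has flowed one hop along the edges; stacking such layers lets information travel farther across the graph.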
Graphs do have a wide range of application scenarios, but compared with traditional grid data, graphs are more complex to represent and harder to process, so graph neural networks also face difficulties and challenges in their application.
But we need not fear these difficulties and challenges, especially now that the era of AIGC has arrived; we should dare to embrace AI technology. ChatGPT, a powerful natural language generation model, is essentially a high-order deep learning model, and it can be combined with the concepts and techniques of graph neural networks to further extend its range of application. Such a combination can help improve ChatGPT's performance in areas such as knowledge graphs, intelligent question answering, and dialogue systems, offering users a smarter, more personalized interactive experience.
Read From Deep Learning to Graph Neural Networks: Models and Practice to learn more.