
In-depth long article: six schools of general artificial intelligence

Author: Yi Dingbang

The concept of Artificial General Intelligence (AGI) traces back to the 1956 Dartmouth Conference, where John McCarthy, Marvin Minsky, Allen Newell, and their colleagues first discussed building machine systems with intelligence as broad and general as a human's.

AGI and AI are closely related but distinct concepts. AI, short for artificial intelligence, is the technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and enhancing human intelligence. As a broad umbrella, AI covers many subfields, such as machine learning, deep learning, speech recognition, and computer vision. AGI, short for artificial general intelligence, is a subset of AI. It pursues systems with the same breadth of intelligence as humans: systems that can learn, think, and understand complex concepts as naturally as people do, and that can adapt across many fields rather than being confined to a single specialty. AGI is also known as "strong AI" or "full AI", and is one of the ultimate goals of AI development.

Since the concept of general artificial intelligence emerged in the middle of the 20th century, AGI research has passed through four stages: origin, gestation, prosperity, and integration.

In the origin stage, the concept of AGI was first articulated, and researchers explored symbolic logic, rule systems, and heuristics as routes to artificial intelligence. These early methods, however, struggled to cope with massive bodies of knowledge and real-world complexity, and AGI remained far out of reach. In the 1980s, machine learning and knowledge-processing methods flourished: neural networks, evolutionary computation, Bayesian methods, and knowledge engineering laid the theoretical and technical foundations for AGI, marking its entry into the gestation stage. The methods bred during this period accumulated valuable experience for AGI's development and widened the paths toward implementation.

At the beginning of the 21st century, deep neural networks, reinforcement learning, knowledge graphs, and related technologies achieved breakthroughs, while big data and high-performance computing accelerated their development and application. AGI research advanced in both breadth and depth, entering a stage of vigorous prosperity. In recent years it has become clear that no single method can solve the problems AGI faces. Cross-school integration has become the new trend: approaches such as neural-symbolic systems, neural logical reasoning, and Bayesian deep learning attempt to combine neural networks, knowledge engineering, and logical reasoning to push AGI forward. This marks the official entry of AGI development into the integration stage.

AGI development now spans more than half a century, across the four stages above. Going forward, broad interdisciplinary cooperation and theoretical innovation will be the keys to reaching the ultimate goal of general artificial intelligence, and the deep integration of methods may open entirely new directions. Below we briefly introduce the characteristics of the six schools.

Whole Brain Emulation

Whole-brain emulation is one of the main theoretical schools pursuing artificial general intelligence (AGI). Its central idea is that by simulating the connection patterns of the human brain's neural networks and applying machine learning algorithms, we can develop artificial intelligence systems that equal or even exceed human intelligence. To realize this vision, researchers rely mainly on two technical means: neural networks and machine learning.

The advantages of this genre are:

  • The human brain is the only known system with human-level intelligence. By simulating it, the essential ingredients of human intelligence can be absorbed to the greatest extent, reducing the risk of missing or ignoring key elements while developing AGI.
  • The human brain has evolved an efficient and stable neural computing framework over a long evolutionary history. Simulating this framework largely avoids the risks of design flaws and runaway AI systems.
  • Unlocking the mysteries of the human brain lets us understand human intelligence at its roots, providing fresh ideas and inspiration for every school of research. To understand life is to control it.

A neural network is an information processing system that allows for highly parallel and distributed information processing by simulating the way neurons in the human brain are connected. Artificial neural networks consist of a large number of simple processing units (simulated neurons) and weighted connections between them (simulated synaptic connections). This network structure allows neural networks to exhibit several important characteristics:

  • Distributed knowledge storage. Knowledge is stored in the weights of the network's connections.
  • Fault tolerance. Damage to some connections or nodes does not cause the whole system to fail.
  • Parallel processing. All nodes can process information at the same time.
  • Ability to learn. The connection weights can be adjusted from training examples to fit the mapping between inputs and outputs.

Deep learning uses neural networks with many hidden layers to achieve feature learning and pattern recognition from large-scale training data. It breaks through the limitations of earlier artificial neural networks in structural design and training algorithms, pushes network models deeper and wider, and greatly improves their expressive power and learning ability. Convolutional neural networks, recurrent neural networks, and deep reinforcement learning are all successful applications of deep learning.

Machine learning provides the algorithms and methods needed to train neural networks. By repeatedly receiving input data, predicting outputs, and correcting errors, a learning algorithm continually adjusts the network's connection weights, so that the network learns and adapts to the complex correspondence between inputs and outputs.
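
As a toy illustration of this predict-and-correct loop, the sketch below (all names and numbers are invented for illustration, not any particular system) trains a single artificial neuron, a perceptron, to reproduce the logical AND function by repeatedly adjusting its connection weights:

```python
# A single simulated neuron learns AND by the train-predict-correct loop:
# predict an output, measure the error, nudge the connection weights.

def step(z):
    """Threshold activation: fire (1) if the weighted input is non-negative."""
    return 1 if z >= 0 else 0

def train_perceptron(samples, epochs=20, lr=0.1):
    w = [0.0, 0.0]   # connection weights: the network's stored "knowledge"
    b = 0.0          # bias term
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = step(w[0] * x1 + w[1] * x2 + b)
            error = target - pred        # prediction error
            w[0] += lr * error * x1      # correct the weights
            w[1] += lr * error * x2
            b += lr * error
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
preds = [step(w[0] * x1 + w[1] * x2 + b) for (x1, x2), _ in AND]
print(preds)  # [0, 0, 0, 1]: the learned AND function
```

Deep learning stacks many layers of such units and replaces the hard threshold with differentiable activations trained by gradient descent, but the learn-from-error principle is the same.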

Neural networks and deep learning are key technical means for realizing AGI. In the future, larger and deeper network structures combined with more powerful learning algorithms may enable the leap from narrow artificial intelligence to general artificial intelligence, making this one of the main candidate routes to AGI.


Whole-brain emulation, like Scarlet Witch, wields a powerful telepathic gift for probing the mysteries of human thought. It tries to enter the temple of artificial intelligence by simulating the unparalleled neural network of the human brain, transforming data into mind. It dreams that by restoring and reproducing the intricate connections and interactions of brain neurons, a machine can acquire the human abilities to learn, understand, and reason, and eventually awaken a mind and personality of its own.

Evolutionary Intelligence

Evolutionary intelligence is another important path and theoretical school toward AGI. Its core idea is that by simulating the process of biological evolution, large-scale neural networks can self-organize, self-learn, and self-optimize in complex environments, eventually reaching intelligence equal to or greater than human.

The advantages of this approach are:

  • Evolution is the only natural process known to have produced intelligent life. Simulating it allows AI systems to develop in a more biologically plausible and controllable way.
  • The evolutionary process lets a system automatically discover the best structures and algorithms for solving complex problems. Through survival-of-the-fittest competition, the system keeps improving and learning in a brain-like manner.
  • Evolutionarily trained neural networks show strong generalization and robustness. They handle inputs outside the training data well and are less prone to "forgetting" what they have learned.

However, this method also faces real difficulties. Reaching human-level intelligence would require an enormous number of evolutionary iterations and ultra-large-scale computation, running into bottlenecks in both technology and computing resources. It is hard to confirm whether evolved neural networks will actually reach AGI, and we cannot be fully sure how they will behave or how rational they will be, which poses security risks. Evolutionary algorithms are also poorly controlled: the structure and algorithms of the resulting networks cannot be precisely designed by hand, whereas AGI systems require fine-grained design to stay consistent and not spin out of control.

Currently, DeepMind is representative of this school. Its AlphaGo and AlphaZero are neural networks trained through iterative self-play and selection, an evolution-like process of continual variation and improvement, and they have demonstrated superhuman ability in narrow domains. To achieve AGI, however, the difficulties above must be solved and both algorithms and computing power must advance greatly, which may take a long time. Even so, the potential of evolutionary intelligence as an efficient, biologically inspired path should not be underestimated.
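
To make the variation-and-selection loop concrete, here is a minimal sketch on a toy problem ("OneMax": evolve a bit string toward all ones); the population size, mutation rate, and genome length are invented for illustration:

```python
# A toy evolutionary algorithm: a population of bit strings evolves toward
# the all-ones optimum through random mutation and survival of the fittest.
import random

random.seed(0)

GENOME_LEN = 20
POP_SIZE = 30

def fitness(genome):
    return sum(genome)  # more 1-bits means fitter

def mutate(genome, rate=0.05):
    # Flip each bit independently with a small probability (variation)
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(200):
    population.sort(key=fitness, reverse=True)
    survivors = population[:POP_SIZE // 2]        # selection (elitist)
    offspring = [mutate(g) for g in survivors]    # variation
    population = survivors + offspring
    if fitness(population[0]) == GENOME_LEN:
        break  # optimum reached

best = max(population, key=fitness)
print(fitness(best))  # best fitness found (20 is optimal)
```

Neuroevolution applies this same loop to the weights or architectures of neural networks, with fitness measured by task performance instead of bit counts.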


Evolutionary computing, like the mutants, represents the future of biological evolution, trying to endow machines with unique adaptability and latent intelligence by simulating evolution, nature's own engine of intelligence. It dreams of making machines the heirs of natural evolution: through random variation and the alchemy of natural selection, they would autonomously evolve strange superpowers to solve complex problems in the ocean of data. It aspires to create a new species, artificial life with self-learning, self-organizing, and self-optimizing capabilities, whose wisdom and mechanisms would emerge from the power of evolution, beyond our plans and expectations.

Bayesian Program Learning

Bayesian program learning is another important research school for AGI. Its core idea is to let AI systems learn independently and build knowledge through Bayesian inference and probabilistic graphical models, eventually reaching generalized intelligence close to the human level.

The advantages of the Bayesian school over several other methods are:

  • Bayesian theory provides a probabilistic mechanism for recognition and judgment. This lets AI systems reason and decide under uncertain and incomplete information, closer to the way humans think.
  • Bayesian networks enable automatic feature learning and knowledge discovery. This gives AI systems the ability to build knowledge structures and acquire new knowledge on their own, which is necessary for AGI.
  • Bayesian methods support natural language understanding and generation. By learning the probabilistic and statistical regularities of language, functions such as machine translation and dialogue generation can be realized, which helps human-computer interaction and AGI.
  • Bayesian models are highly interpretable. The probability calculations in the inference process explain why the system reaches a judgment or produces a given output, enhancing reliability and controllability.
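
The first point, probabilistic judgment under incomplete information, can be sketched with a single application of Bayes' rule. The disease-and-test numbers below are invented purely for illustration:

```python
# Bayes' rule updates a belief as uncertain evidence arrives.

def bayes_update(prior, likelihood, false_positive_rate):
    """P(hypothesis | evidence) from the prior and the evidence model."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Belief that a rare condition is present, before any test: 1%
prior = 0.01
# A test that detects the condition 95% of the time, with a 5%
# false-positive rate
posterior = bayes_update(prior, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161: one positive test is weak evidence

# A second independent positive test strengthens the belief substantially
posterior2 = bayes_update(posterior, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior2, 3))  # 0.785
```

Note how the model stays calibrated: a single positive result on a rare condition yields only a 16% posterior, mirroring the graded, evidence-weighted judgments the school attributes to human reasoning.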

However, the genre also presents some challenges:

  • Bayesian inference has high computational complexity and is difficult to apply to ultra-large neural networks and knowledge systems. This limits its role in AGI implementation.
  • Bayesian methods struggle in dynamic, rapidly changing environments. An AGI system needs to learn and decide in open, complex settings, which is a shortcoming of this method.
  • The performance of Bayesian networks depends on manually specified probability distributions and parameters, limiting their learning autonomy and generalization ability.
  • Bayesian models are hard to integrate tightly with other machine learning methods such as deep neural networks, which hinders the combined use of multiple approaches.

OpenAI, Anthropic, and Google's DeepMind all make use of this approach in their technology, and have contributed to natural language processing, machine learning, and AGI theory. To achieve true AGI, however, the school needs further theoretical and technical advances, especially on problems of computational tractability and environmental adaptability, which will take a long time. Bayesian methods are a valuable direction of exploration for AGI, but may be better suited as an auxiliary means than as a standalone route to the goal.


Bayesian program learning is like Iron Man, using probabilistic calculation and mathematical models to find ways to decide with incomplete information. It tries to use Bayes' theorem, the soul of statistics, to build machines into efficient reasoners that render judgments from massive amounts of data. Through the science and engineering of probability and uncertainty, it aspires to make AI systems navigators of data and pioneers of the unknown, able, like Iron Man, to respond and decide rapidly in complex, changeable environments and to become humanity's best partners in cooperation and survival.

Symbolic Logic & Knowledge Graphs

Symbolic logic and knowledge graphs form another important research school for AGI. Its core idea is to achieve human-like intelligence by building a broad, deep knowledge system together with a logical reasoning mechanism. Symbolic logic is a method of knowledge representation and reasoning with its own theoretical system, inference rules, and computational models. The knowledge graph, by contrast, is a knowledge engineering technology and product that has not yet formed a unified, complete theory; different knowledge graphs differ in construction methods, knowledge sources, organizational structure, and other respects.

The advantages of this school of technology are:

  • A knowledge graph can represent a wide range of human knowledge about the world, the knowledge base an agent needs in order to operate. Building a complete knowledge graph is a cornerstone of AGI.
  • Logical reasoning can carry out high-level complex inference and knowledge discovery. This gives AI systems human-level rational thinking and reasoning capabilities for solving complex problems.
  • AI systems built this way are highly interpretable. By following the logical reasoning path, humans can understand why the system makes its decisions and produces its outputs, which helps keep it safe and controllable.
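
A minimal sketch of the idea (the facts and the single transitivity rule below are illustrative assumptions, not any particular system's API): knowledge is stored as subject-relation-object triples, and a forward-chaining rule derives facts that were never stated explicitly:

```python
# A toy knowledge graph as subject-relation-object triples, plus one
# logical rule applied by forward chaining until no new facts appear.

triples = {
    ("cat", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("cat", "has", "whiskers"),
}

def infer_is_a(triples):
    """Forward-chain the rule: is_a(X, Y) and is_a(Y, Z) => is_a(X, Z)."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(known):
            for (c, r2, d) in list(known):
                if r1 == r2 == "is_a" and b == c:
                    new_fact = (a, "is_a", d)
                    if new_fact not in known:
                        known.add(new_fact)
                        changed = True
    return known

facts = infer_is_a(triples)
print(("cat", "is_a", "animal") in facts)  # True: derived, never stated
```

The reasoning path (cat is a mammal, a mammal is an animal, therefore a cat is an animal) is fully inspectable, which is exactly the interpretability advantage claimed above.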

The MIT Artificial Intelligence Laboratory has made outstanding progress in this direction, with notable explorations in knowledge representation, automatic reasoning, and machine understanding. However, the method also faces the following difficulties:

  • Building a complete knowledge graph and reasoning over it efficiently is a daunting task. We still lack the theory and technology to do this at scale, which limits the method's role in AGI implementation.
  • Purely logical reasoning struggles with uncertainty and probabilistic scenarios. AGI systems need to operate flexibly in uncertain environments, which is a shortcoming of this approach.
  • Symbolic logic and knowledge graphs are difficult to combine with machine learning, neural networks, and other methods, yet AGI requires the synergy of multiple approaches; this hinders the school's development.
  • The knowledge and outputs produced this way often appear rigid and lack common sense, leaving a large gap from the breadth and depth of human understanding.

Symbolic logic and knowledge graphs laid a theoretical cornerstone for artificial intelligence and offer a valuable direction of exploration toward AGI. To reach the summit, however, they will need broad integration with other methods, especially neural networks and machine learning. That demands interdisciplinary effort and theoretical innovation, through which the best path to general artificial intelligence may perhaps be found; it is the key to this school's development and deserves wide attention and investment from the AI field.


Symbolic logic and knowledge graphs are like Star-Lord, trying to unravel the mysteries of artificial intelligence through the organization and management of knowledge. This school aspires to turn machines into explorers of data and information, guided by the knowledge system as a navigation map of cosmic truth. It dreams of using the magic of knowledge engineering to build AI systems into navigators holding the coordinates of the knowledge universe, like Star-Lord, sailing freely across the vast ocean of knowledge and answering our unsolved mysteries through the correlations and reasoning between pieces of knowledge. It represents the idea of exploring a new world of AI through the logic and structure of knowledge, hoping to reshape AI through knowledge's power and rules and to achieve a higher level of understanding and communication between human and machine.

Ensemble Learning

Ensemble learning may be the most promising path to AGI. Its core idea is to build a broad, deep learning and reasoning system by integrating multiple machine learning methods, such as neural networks, evolutionary algorithms, and Bayesian methods, thereby achieving human-like intelligence.

This school makes extensive use of machine learning methods such as decision trees, support vector machines, Bayesian methods, neural networks, cluster analysis, transfer learning, and autoregressive models. Its advantages are:

  • Ensemble learning can combine the strengths of various approaches and compensate for the shortcomings of individual methods, achieving more powerful intelligence. No single approach can meet all the challenges AGI faces.
  • The collaboration of multiple learning mechanisms can accelerate the learning process and achieve broader knowledge acquisition and understanding. This helps to shorten the time required to achieve AGI.
  • The integrated system can choose the appropriate approach according to the complexity of the task and achieve the best solution. This enhances the adaptability and efficiency of the system.
  • The ensemble approach can simulate synergy between various regions of the human brain, which helps build AI systems that are more similar to the human brain.

The ensemble learning school has a less distinct identity than the others: it fuses multiple machine learning algorithms or models to achieve joint learning and inference, and it does not correspond to a specific model structure or learning framework the way the neural network or evolutionary intelligence schools do. Rather, ensemble learning emphasizes an idea: improve generalization and robustness by combining multiple learning models. This idea has had wide impact and is regarded as one of the important paradigms for achieving AGI.
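
The core intuition, that combined models can correct one another's mistakes, can be sketched as a simple majority vote. The three rule-based "classifiers" below are invented stand-ins for real learned models:

```python
# Majority-vote ensemble: three flawed classifiers, each wrong on a
# different sample, together classify every sample correctly.

# Toy task: decide whether a number is positive (label 1) or not (label 0).
samples = [(-3, 0), (-1, 0), (0, 0), (2, 1), (5, 1), (10, 1)]

def clf_a(x):
    return 1 if x > -1 else 0      # mistakenly labels x = 0 as positive

def clf_b(x):
    return 1 if x > 3 else 0       # mistakenly labels x = 2 as negative

def clf_c(x):
    return 1 if 0 < x < 8 else 0   # mistakenly labels x = 10 as negative

def majority_vote(x, classifiers):
    votes = sum(c(x) for c in classifiers)
    return 1 if votes > len(classifiers) / 2 else 0

def ensemble(x):
    return majority_vote(x, [clf_a, clf_b, clf_c])

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

print(accuracy(clf_a, samples))     # each individual model: 5/6
print(accuracy(ensemble, samples))  # the majority vote: 6/6
```

Because each model errs on a different region, every wrong vote is outvoted by two correct ones, which is the "whole greater than the parts" effect the school relies on.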


Ensemble learning, like Captain America, tries to explore the mysteries of artificial intelligence by combining different methods and perspectives. It aspires to integrate and coordinate multiple paths of AI research so that the whole becomes greater than the sum of its parts. Through an alliance of different algorithms and theories, it dreams of building AI systems into a superhero like Captain America, skilled at coordinating the strengths of many heroes and uniting individual powers behind the team's goal. It represents the idea of reshaping AI through the fusion of algorithms and methods, hoping that complementary means can produce a leap in AI research and, ultimately, deep integration and collaboration between human and machine.

Large Language Models (LLMs)

A large language model is a deep learning model trained on large amounts of text that can generate natural language text and understand its meaning. Large language models can handle many natural language tasks, such as text classification, question answering, and dialogue, and they form an important path toward artificial intelligence.

LLMs use neural networks and massive data to learn natural language without supervision, extracting the statistical regularities and features of language in order to understand and generate it. Representative models include OpenAI's GPT series, Google's BERT series, Meta's RoBERTa series, Baidu's Wenxin Yiyan, Fudan University's MOSS, Alibaba's Tongyi Qianwen, and Huawei's Pangu model.

The technology and applications of large language models have advanced greatly in recent years, creating a new mode of natural language understanding and generation that has had a revolutionary impact on artificial intelligence and made LLMs an active frontier of machine learning and natural language processing. We focus on this school in the sections below.

The large language model school has the following main characteristics:

  • Based on deep neural networks. Large language models use deep neural networks, especially Transformer architectures, to learn language representation and generation, an important application of deep learning techniques.
  • Driven by massive data. Large language models are trained on large-scale text datasets, typically billions of words or more, which demands powerful computing and optimization techniques.
  • Unsupervised learning. A large language model learns from the statistical regularities of language itself, without manually labeled data. This brings it closer to the way humans acquire language.
  • Language expression and comprehension. A large language model has both the generative ability of language expression and the reasoning ability of language understanding; the two reinforce each other and contribute to a unified linguistic representation.
  • Multi-task learning. Large language models can achieve multi-task learning and transfer by fine-tuning on different natural language processing tasks, improving generalization and applicability.
  • A standard model. Large language models are widely used to improve the performance of other natural language understanding tasks, serving as a standard pre-training technique. This greatly reduces the cost and difficulty of R&D.
  • An open technology. Large language models built on deep learning and Transformers have revolutionized natural language understanding and belong to an open, rapidly developing field in which new techniques and applications continually emerge.
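
The "unsupervised, data-driven" points above can be miniaturized into a bigram model: from raw text alone, with no labels, it learns next-word statistics and predicts a continuation. Real LLMs do this with deep Transformers over billions of words; the tiny corpus below is invented for illustration:

```python
# A toy bigram language model: count which word follows which in raw
# text (unsupervised "training"), then predict the likeliest next word.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the bird ."
).split()

# Training: count how often each word follows each other word
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Most probable next word under the learned statistics."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": the most frequent successor of "the"
print(predict_next("sat"))  # "on"
```

Scaling the same principle up, longer contexts instead of a single word, learned embeddings instead of counts, and attention instead of a lookup table, yields the GPT-style models this section describes.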

The advantages of large language models are also obvious:

  • Wide range of applications. Large language models have achieved remarkable results in many natural language understanding tasks, such as machine translation, question answering, and dialogue systems. Pre-trained on massive data, they have powerful abilities of language expression and comprehension.
  • Versatile. A large language model is a general language understanding framework that can be applied to different downstream tasks; with multi-task learning or fine-tuning, its learned representations transfer to new tasks.
  • Lower cost. Large language models can be pre-trained without supervision on massive unstructured text and do not require large amounts of manually labeled data, greatly reducing the cost and difficulty of R&D.
  • Learning the essence of language. By analyzing large-scale language datasets, large language models can discover the statistical regularities and internal structure of human language, bringing their understanding closer to that of humans, with mechanisms similar to human language use.
  • Computer language understanding. Large language models give computer systems the ability to understand natural language and even to write program code automatically, opening broad prospects for AI interaction, recommendation systems, and other new technologies, and promoting the development of artificial intelligence.
  • Explainable and controllable. Although large language models inherit the opacity of deep neural networks, related techniques such as interpretability methods and adversarial-example analysis are advancing quickly, making their decision processes clearer and more controllable and increasing their potential in production systems.
  • An active research area. Large language models are an active field in which new techniques and applications keep emerging, reflecting strong vitality and offering researchers ample opportunity and room to grow.

Large language models are like Black Widow, exploring the future of artificial intelligence through the mysteries of language. The school aspires to train machines on massive corpora until they become masters of language and artists of communication. It dreams of learning the laws of language through neural networks and building AI systems into super spies like Black Widow, fluent in many languages and skilled at communication. It represents the idea of reconstructing AI through the essence of natural language understanding, hoping that machines' deep learning and understanding of language will enable barrier-free dialogue between humans and AI.

The most impressive trait of the large language model school is its strong ability to express and understand language, which has let many ordinary people glimpse the charm of artificial intelligence and has effectively drawn public attention to the field.

However, some experts take a negative view of this school. At an AI summit held in Beijing in June 2023, computer scientist Stuart Russell, a professor at the University of California, Berkeley, said in a speech: "ChatGPT and GPT-4 are not 'answering' questions; they do not understand the world. General artificial intelligence has not yet arrived, and the large language model is just one piece of the puzzle; we are not even sure what the finished puzzle will look like."

Yann LeCun, another AI expert, a Turing Award winner, one of the "big three" of deep learning, and chief AI scientist at Meta, countered GPT: "Autoregressive models simply don't work, because they don't have the ability to plan and reason. Autoregressive large language models that generate purely from probabilities cannot, in essence, solve the problems of hallucination and error. As the input text grows longer, the probability of error increases exponentially."

So if artificial intelligence is to lead to AGI, what is its next step? LeCun's answer is the "world model": a model that not only mimics the human brain at the neural level but also matches, at the level of cognitive modules, the brain's internal model of the world. Its biggest difference from the large language model is that it would have the abilities to plan, to predict, and to reckon costs.

After decades of development, artificial intelligence has begun to understand human language and vision, bringing it into wide use in social life and business. This marks an important turning point from the laboratory to practical application. AI development still requires the integration and innovation of multiple methods and still faces real difficulties and challenges, but its influence will be far-reaching, deserving our continuing attention and discussion.

