
God's Embarrassment: Talk about the incomprehensibility of deep learning

Author: Dr. Guojun Cao

From Kepler to Newton we can see that science begins with statistics and is completed by insight. Statistical treatment of data has therefore always been regarded as a preliminary capability or method. Every revolutionary leap in science comes from deep insight, as was the case with Newton's classical mechanics, and later with relativity and quantum mechanics.

In its outward expression, this "insight" lets us use concise mathematical formulas and methods to describe, universally, the essential laws governing the motion of things. Yang Zhenning (C. N. Yang) holds that this "beauty" of science comes from the "subtlety" of the universe itself. The pursuit of this subtle beauty is a scientific tradition that began with Newton. Even in quantum mechanics, which caused Einstein great confusion, the Schrödinger equation describing the motion of microscopic particles is quite concise and beautiful in form:

iħ ∂Ψ/∂t = ĤΨ

Hence some people have a certain criticism of, and disdain for, an artificial intelligence dressed up in the "lowbrow" garb of ever more complicated statistical methods. The picture below is a caricature satirizing this situation.

[Caricature satirizing AI's reliance on statistical methods]

Statistical methods, machine learning and artificial intelligence

Why can't we base AI on subtle and universal mathematical expressions, as traditional sciences such as physics do?

When artificial intelligence was proposed in 1956, scholars developed the field along just this line of thinking. In the 1950s and 1960s they tried to find simple and clear basic mechanisms of intelligence, hoping to build different kinds of artificial intelligence systems on top of them. The schools known as symbolism, connectionism, behaviorism and so on all appeared at that time, each trying to follow this path. By the 1970s these schools had gradually run into trouble. At this point, two things happened in the field of artificial intelligence that had a far-reaching influence on what followed.

One was that scholars studying the mechanisms of conscious thought created a new discipline, "cognitive science", to explore the internal mechanisms of human conscious activity;

The other was the emergence of the pragmatically oriented "expert system". Expert systems draw on the domain knowledge of human experts to solve problems in a specific field. Scholars working on expert systems no longer aimed to understand or replicate general human intelligence, but focused on solving practical problems with realistic methods. Expert systems were therefore rejected by "fundamentalist" AI scholars, but they did allow AI to take an important step from the ideal toward reality.

In the decades that followed, AI focused increasingly on solving practical problems. Then, in the second decade of the new century, "brute-force computation" turned stone into gold: statistical methods represented by deep learning proved effective in many fields, and artificial intelligence, through the efforts of generations of scholars, finally became one of the few technological commanding heights attracting global attention.

From the process described above, it is not hard to see that the human brain's power of insight, when turned back upon itself, has never uncovered the essential laws of consciousness. After various attempts, artificial intelligence has, with the help of "brute-force computation", returned to statistical methods, the starting point of scientific development.

Since the birth of modern science, and especially after the explosive breakthroughs of the 20th century, we have gone beyond the traditional craftsman's method of experience: for everything we do, we demand to know not only that it is so, but also why it is so. Kepler's work was "knowing that"; Newton's was "knowing why". So although artificial intelligence has returned to statistical methods and achieved astonishing brilliance, this is in fact more of a helpless choice.

A major source of confusion brought about by this helplessness is the "uninterpretability" of artificial intelligence: the "uninterpretability" of deep learning statistical models based on artificial neural networks, or, to put it more mildly, their poor "interpretability". "Uninterpretability" means that we do not know exactly what "knowledge" a deep learning model extracts from the data through "learning", we do not know how it uses this "knowledge" to solve problems, and of course we do not understand why it fails when it does. Hence some call it a "black-box model".

The term "black box" here is different from the traditional meaning of material technology and tools. The traditional black box is when we cannot see the processes inside a system, but can only observe it from the inputs and outputs of the system. This is not the case with deep learning models. The details of every step of its internal process are designed manually and are clear at a glance. In this sense, it is actually a "white box" in the traditional sense. The confusion is that we can't relate its internal computation to its overall function to understand how it can behave. This kind of incomprehensible "white box" basically does not exist in the field of material technology and tools.

At NIPS (the Conference and Workshop on Neural Information Processing Systems) 2017 there was a very heated debate on whether interpretability is necessary in machine learning. Some scholars at the time, including Yann LeCun, argued that this "uninterpretability" does not matter: if artificial intelligence is seen only as a tool for solving problems, then "it does not matter whether the cat is black or white, as long as it catches mice it is a good cat". That is in fact how we use deep learning now, and we have created many miracles with it.

But there is another side to the story, one that keeps many people from letting go of "uninterpretability".

First, when we do not understand how a statistical model does its job, and the model has many possible variants to choose from, the first difficulty we encounter is that we cannot determine by analysis and inference which particular model to choose for a specific problem. We can only rely on experience and experiment to find the model's structure and size, as in the sketch below. This artisan practice is out of step with the mainstream of modern science and technology, and the efficiency and cost of applying it are things we very much hope to improve dramatically.
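A minimal sketch of this "craftsman practice", under assumed toy data and arbitrarily chosen candidate sizes: with no theory telling us which architecture fits the problem, we simply try several hidden-layer widths and keep whichever scores best on held-out data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2             # toy target function
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

scores = {}
for width in (4, 16, 64):                       # candidate model sizes
    net = MLPRegressor(hidden_layer_sizes=(width,), max_iter=2000,
                       random_state=0).fit(X_tr, y_tr)
    scores[width] = net.score(X_va, y_va)       # judged only by experiment
print(scores)                                   # pick the winner, with no theory of why it won
```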

Second, this uninterpretability extends to the way we use statistical computation to determine the model's parameters, that is, to what is called the model's "learning". We do not know whether the statistical procedure used, the learning algorithm, can actually find the parameters that give the model its best performance; we can only estimate its effectiveness through craftsman-like experiments and decide from those experiments how to improve the algorithm.
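A minimal sketch of what this "learning" amounts to in practice: gradient descent on a statistical loss. The data, model, learning rate and step count below are illustrative assumptions; nothing in the procedure itself guarantees that the loop finds the best possible parameters, which is exactly why we judge it only by experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))                                  # toy inputs
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=100)     # toy targets

w = np.zeros(2)          # parameters to be "learned"
lr = 0.1                 # step size chosen by trial and error
for step in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)    # gradient of the mean squared error
    w -= lr * grad                          # one statistical update
print(w)   # close to [2, -1] in this toy case, but only the experiment tells us so
```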

And most importantly, there is the issue of "trust" in use. When we say "trust", we are not referring to a past that has already happened, but to things in the future that have not yet happened. To "trust" a tool means that, looking forward, we believe it will behave as it did in the past, without surprises. If we do not understand how a method works and how it produces its existing results, then even if it has performed very well in the past we cannot simply pronounce the word "trust", especially in more complex application scenarios. "Knowing why" is the "rational" pursuit that science has engraved on our hearts over hundreds of years. And so, when faced with matters of life and death or of serious responsibility, it is hard for us to make up our minds to use such "unexplainable" methods.

Some other statistical models used in machine learning also have poor "interpretability" to varying degrees; deep learning models are simply among the worst offenders. Obtaining model interpretability has therefore naturally become an issue of great concern. Over the years a large number of scholars have worked on it from different angles (Li Lingmin et al., "A Review of Explainability Research on Deep Learning", Computer Applications, 2022, 42(12): 3639-3650), but there has been no decisive breakthrough. So much so that some people joke: "Your circle is really a mess." The "mess" is that the explanations remain at the level of appearances and do not reach the essence.

In fact, in the field of machine learning, there is no uniform standard for "interpretability". The interpretability of machine learning models can be roughly divided into two types: Intrinsic Interpretability and Post Hoc Interpretability.

Intrinsic interpretability refers to understanding the model's own learning and working mechanisms. For deep learning models this is extremely difficult, so there is relatively little related research. Post hoc interpretability means analyzing a model's working process after its training is completed, which is more of a case-by-case explanation. It is always easier to discuss individual cases, so there are many papers exploring this.
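A minimal sketch of such a post hoc, case-by-case "explanation", using a made-up stand-in for a trained model: perturb each input feature slightly and record how much the output moves. This attributes one particular prediction to its inputs, but says nothing about the model's internal mechanism.

```python
import numpy as np

def model(x):
    # Stand-in for any already-trained black-box model.
    return np.tanh(3 * x[0] - 2 * x[1] + 0.5 * x[2])

def sensitivity(x, eps=1e-3):
    # Finite-difference sensitivity of the output to each input feature.
    base = model(x)
    return np.array([(model(x + eps * np.eye(len(x))[i]) - base) / eps
                     for i in range(len(x))])

x = np.array([0.3, -0.2, 1.0])
print(sensitivity(x))   # per-feature influence, valid for this one input only
```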

Consciousness in the human brain is a black hole that we cannot yet effectively observe and measure, so it is entirely understandable that we have no clue about its internal mechanism. But the deep learning model is designed by humans ourselves and, in the traditional sense, could not be any more "white"; that after enormous effort we still find it difficult to "understand" it is truly one of the oddest curiosities in modern science and technology, an embarrassment for human beings as creators.


Perhaps we are caught in an unconscious, self-imposed constraint: we keep trying to dissect it with the knowledge and methods we already have, trying to explain it within the framework of existing science and technology. The most basic manifestation is that we take it for granted that the model is a function mapping.

Looking back at the history of science: in Galileo's time we freed ourselves from the logic of theology and began to understand the world with a new scientific perspective and method, thereby establishing science. At the turn of the 20th century, because blackbody radiation and the Michelson-Morley experiment could not be explained within the existing scientific framework, we broke through, respectively, the traditional scientific conviction that physical quantities must be "continuous" and infinitely divisible and Newton's assumption of absolute space-time, and created quantum mechanics and relativity.

Is the existing scientific and technological framework, which grew out of the description of material phenomena, suited to cognitive activity? At the bottom of the human brain are material processes, and we still cannot understand how those processes produce consciousness; at the bottom of a deep learning model are numerical computations, and we have not yet understood how those computations form its overall conscious function. The two are strikingly similar in this rupture between bottom and top. Does that hide some common secret?


Phenomena or problems that cannot be explained by the theories and knowledge of existing frameworks have always been the dream of explorers at the scientific frontier, because they are often precious opportunities for revolutionary progress. Perhaps the "uninterpretability" of deep learning models based on artificial neural networks is a problem of "extra-consciousness" lying beyond the existing framework of science and technology, one that requires us to start from a completely new foundation and build a new theory on a new set of methods, just as Newton built the edifice of mechanics on his newly invented calculus. Perhaps this effort could open the door to a new frontier of knowledge?

If future exploration bears out this hypothesis, it should not come as a surprise, because artificial neural network models were not, at their origin, the result of analysis and reasoning within the existing scientific framework. They were created by analogy, inspired by the densely interconnected structure of neurons in the human brain. This path was once dubbed "connectionism", and the term itself shows that it did not take traditional mathematical methods, such as the theory of function mappings, as its foundation from the start. Yet apart from the word "connectionism", a collection of technical methods, and the widespread application of "brute-force computation", it has made no real theoretical progress.


Breaking through the existing framework requires Newton-like insight: establishing a new conceptual foundation and corresponding methods through careful observation of the learning process of deep learning models, rather than setting the real underlying process aside and indulging in metaphysical, empty conjectures with no information content. The now fashionable word "emergence" is a product of the latter. It describes the phenomenon in which a large number of collective activities produce a certain result, but its introduction offers no real help in understanding the process, beyond making the user appear learned to the layman.

In traditional fields of science and technology, introducing such contentless "new concepts" does not conform to basic scientific norms. That this unsupported practice is relatively widespread in the field of artificial intelligence shows that the field is still in a "pre-scientific" era.

Every revolutionary advance in science is often dressed up by later generations as the product of grand, metaphysical philosophical ideas. In fact, every scientific revolution begins with confronting concrete problems and analyzing real processes; it is not the product of ideas. Practice produces true knowledge: that is the basic mode by which human beings understand the world.


"Extra-consciousness" is implemented by algorithms, so the understanding of any type of "extra-consciousness" should follow the above principle, that is, starting from the basic process of the algorithm to analyze what it does, why it can do it, and how it does it. Any "interpretation" that deviates from the basic process of algorithms is a self-absorbed conjecture and does not contribute to the development of information science and technology.
