By | Brain Polar Body
You may not have heard of the DIKW pyramid, but you have almost certainly been on the receiving end of the pecking order it implies.
A game streamer once described his own read of a match like this: the audience sees only the second layer and assumes I am on the first, when in fact I am on the fifth. Netizens have since taken to describing any unexpectedly clever move with the line, "This round, he is up in the atmosphere."
Ridiculous as the joke sounds, there is some scientific truth in it.
The DIKW pyramid is a hierarchy of human understanding, reasoning, and interpretation, namely: data (the original set of facts), information (structured data that can be analyzed and measured), knowledge (which requires insight and comprehension to learn), and wisdom (to guide action).
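The four layers can be made concrete with a toy pipeline. The following is only a minimal sketch, not a standard implementation: the temperature readings, the 38.0 threshold, and every function name are hypothetical and chosen purely for illustration.

```python
from statistics import mean

# Data: raw facts (hypothetical body-temperature readings in Celsius).
readings = [36.6, 36.8, 38.4, 38.9, 39.1]

def to_information(data):
    """Information: data structured so that it can be analyzed and measured."""
    return {"count": len(data), "mean": round(mean(data), 2), "latest": data[-1]}

def to_knowledge(info, fever_threshold=38.0):
    """Knowledge: an interpretation of the information (threshold is assumed)."""
    return {
        "fever": info["latest"] >= fever_threshold,
        "rising": info["latest"] > info["mean"],
    }

def to_wisdom(knowledge):
    """Wisdom: knowledge turned into a decision that guides action."""
    if knowledge["fever"] and knowledge["rising"]:
        return "see a doctor"
    if knowledge["fever"]:
        return "rest and monitor"
    return "no action needed"

info = to_information(readings)
knowledge = to_knowledge(info)
advice = to_wisdom(knowledge)
print(info, knowledge, advice)
```

Each function consumes only the output of the layer below it, which is the sense in which the pyramid is a hierarchy.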
Whoever stands at the tip of the DIKW pyramid is like a top player who has cleared every level: someone who has mastered the data, organized it into information, understood it as knowledge, and turned it into wisdom that guides action. Such a person is as resourceful as Zhuge Liang, deploys stratagems with ease, and is definitely "a man standing in the atmosphere".
The DIKW pyramid applies to people. Does it also apply to AI? The answer is yes.
If AI has a pecking order too, then data-based AI will certainly be looked down on by knowledge-based AI.
This is because AI is, at heart, a knowledge technology, that is, a technology driven by knowledge. The development from primary artificial intelligence to advanced artificial intelligence and then to general artificial intelligence is therefore also a climb up the DIKW pyramid.
In recent years, many academic and industrial players in AI have moved away from brute-force computing, with its faith in "data miracles", toward the higher levels of the knowledge pyramid, promoting knowledge computing as the future direction of AI applications.
Arguably, we are at a critical stage in the transition to knowledge-based AI. AI already influences every aspect of our lives, so it is worth asking how AI will change as it climbs the DIKW pyramid.
The Pendulum Swings Back: The Revival of Rationalism
Applying knowledge to machine intelligence is nothing new. As early as the last century, people began exploring knowledge computing and applied it widely in work and daily life.
From the moment it was born, AI has been the meeting point of two major schools, rationalism and empiricism. Both hold that machine intelligence must first have knowledge and that knowledge is the core of intelligence. Where they differ is in how they understand knowledge and how they acquire it.
As these two schools developed, knowledge and AI came to be combined in two ways.
One is the rationalist approach, in which humans provide the knowledge and machines do the computing.
Rationalism holds that human intelligence is innate. To achieve machine intelligence, it is therefore necessary to understand how the human brain works, distill that understanding into knowledge, and then have people tell the machine what to do.
The typical application is the expert system.
Human experts summarize the knowledge, and the computer reasons over the expert system's knowledge base, which makes the results highly interpretable. Since the successful development in 1968 of the world's first expert system, the chemistry expert system DENDRAL, early expert systems that reason within a single field and imitate its experts became popular. They have been widely used in many computing scenarios, including industry, agriculture, medicine, meteorology, transportation, and the military.
However, expert systems work only in specific domains, and they are very expensive to build. They are also capped by the limits of the experts' own understanding: if people have not yet discovered a piece of knowledge, or cannot express it, the machine has no way of learning it.
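The core mechanism of a rule-based expert system can be sketched in a few lines. This is a minimal illustration of forward chaining over if-then rules, not a reconstruction of DENDRAL; the rules, the fact names, and the function `forward_chain` are all hypothetical.

```python
# Hypothetical rule base written by a human expert:
# each rule is (set of required facts, fact to conclude).
RULES = [
    ({"has_fur", "says_meow"}, "is_cat"),
    ({"is_cat"}, "is_mammal"),
    ({"is_mammal"}, "is_animal"),
]

def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived.

    Returns the full set of known facts and, for each derived fact,
    the conditions that justified it (the source of interpretability).
    """
    known = set(facts)
    why = {}
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                why[conclusion] = sorted(conditions)
                changed = True
    return known, why

known, why = forward_chain({"has_fur", "says_meow"}, RULES)
print(sorted(known))
print(why)

# With only partial input, no rule fires: the system cannot go beyond its rules.
partial_known, partial_why = forward_chain({"has_fur"}, RULES)
print(sorted(partial_known))
```

The `why` dictionary shows why such systems are easy to interpret, and the second call shows their ceiling: nothing is concluded unless a human has already written a rule for it.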
So from the 1990s to the present, another mode of combining AI and knowledge has dominated the mainstream, that is, empiricism.
Developers build a classifier by hand but do not have to know the answer in advance. Without relying on knowledge that even human experts cannot articulate, the kind that "can be sensed but not put into words", the machine mines knowledge from data according to its own operating mechanism, fits model parameters on large-scale data, and on some tasks shows better-than-human ability.
The most representative is deep learning.
Relying on massive data, computing power, and neural networks, Google Brain looked at millions of pictures without human help and without ever being told the word "cat". Through training it turned data into knowledge: it extracted the basic features of cats, learned that a cat is a furry creature (a string of further adjectives omitted here), and then successfully picked out cats from a pile of photos.
Because it rests on huge amounts of data, such AI does not truly understand or master the relevant knowledge. It knows that something is so without knowing why, and so it cannot really replace human experts. Yet it can decompose complex pattern-recognition problems into simpler ones and, on certain specific tasks, perform better and more efficiently than humans, and it has made great progress. Deep learning is therefore regarded as the pinnacle of empiricism and has become the core driver of the third wave of AI.
However, there is an essential difference between data-based AI and knowledge-based AI. The famous Moravec's paradox pointed this out long ago: because machines cannot integrate tacit knowledge into thought and action the way humans do and thereby form higher-order wisdom, they are giants of logic but dwarfs of common sense. They can beat humans at hard problems such as Go, yet on very simple cognitive problems they perform worse than a four- or five-year-old child.
One of the solutions is what rationalism advocates, allowing machines to understand and think like real humans.
As Kenneth Church predicted in "A Pendulum Swung Too Far", AI has swung too far toward empiricism, and the further it swings, the faster it will swing back toward rationalism. The revival of rationalism is approaching.
The Call of the Industry: The Wave of Digital Intelligence and the Light of Knowledge
You may think that empiricism and rationalism are just schools of thought in academia, and have little to do with ordinary people and industry.
In fact, in the wave of industrial intelligence, more and more industries and organizations have begun to call for knowledge-based AI, for the following reasons.
The model design phase requires knowledge-based understanding.
AI has begun to move out of the lab and the ivory tower into many industries, and into the physical and biological worlds, where the data is not simply a string of 1s and 0s.
Take protein structure prediction. A protein is not a simple piece of image data; there is specific meaning behind it. How different molecules relate to one another, how they interact, and by what principles they bind are all governed by a body of biological logic and knowledge. If a model is designed by purely data-driven methods without an understanding of the relevant pharmaceutical knowledge, it is likely to be useless.
Therefore, for AI models to deliver real value in industry, we must take the mechanistic models of the actual work, expert knowledge, and so on, and translate them into a mathematical language that AI can understand, process, and analyze.
The model training phase requires knowledge-based data.
In industrial AI, the data often contains a large amount of implicit information, that is, knowledge that has not been or cannot be explicitly represented, and that usually lives in expert experience or is passed down from mentor to apprentice. To train an industrial model that performs well, you need not only a large amount of complete data but also an accurate description of the knowledge relationships within that data, so that more useful knowledge can be mined from it.
Take the recommendation algorithms we encounter every day. A traditional recommendation algorithm recommends whatever the user already likes, which easily traps the user in an information cocoon. One research team in China combined a knowledge graph of food and nutrition science with a recommendation algorithm, so that recommendations draw on user feedback data, such as clicks, interest preferences, and body data, together with health knowledge.
Knowledge-based data can help create higher-quality, more humane algorithms. In the recommendation example above, instead of endlessly catering to the user, the algorithm offers options that satisfy both taste preferences and health-management needs. Imagine, too, that AI could combine the behavior data of food delivery riders with ordinary common sense: the rat race created by endlessly squeezing delivery times might then also be solvable.
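The idea of blending preference signals with health knowledge can be sketched as a simple re-ranking step. This is not the research team's method, which the article does not describe; it is a minimal illustration in which the dishes, the nutrition figures, the preference scores, and the weighting formula are all made-up assumptions.

```python
# Hypothetical knowledge-graph fragment: dish -> nutrition facts (made-up values).
NUTRITION_KG = {
    "fried chicken": {"calories": 650, "sodium_mg": 1200},
    "grilled salmon": {"calories": 420, "sodium_mg": 350},
    "vegetable salad": {"calories": 180, "sodium_mg": 200},
    "beef noodles": {"calories": 560, "sodium_mg": 1500},
}

# Hypothetical preference scores in [0, 1], e.g. learned from click data.
PREFERENCE = {
    "fried chicken": 0.9,
    "grilled salmon": 0.7,
    "vegetable salad": 0.4,
    "beef noodles": 0.8,
}

def health_score(dish, calorie_limit, sodium_limit_mg):
    """1.0 if the dish is within both limits, lower the further it exceeds them."""
    facts = NUTRITION_KG[dish]
    calorie_ratio = min(1.0, calorie_limit / facts["calories"])
    sodium_ratio = min(1.0, sodium_limit_mg / facts["sodium_mg"])
    return calorie_ratio * sodium_ratio

def recommend(calorie_limit, sodium_limit_mg, health_weight=0.5):
    """Rank dishes by a weighted mix of preference and health knowledge."""
    def combined(dish):
        health = health_score(dish, calorie_limit, sodium_limit_mg)
        return (1 - health_weight) * PREFERENCE[dish] + health_weight * health
    return sorted(PREFERENCE, key=combined, reverse=True)

preference_only = recommend(500, 600, health_weight=0.0)
knowledge_aware = recommend(500, 600, health_weight=0.5)
print(preference_only)
print(knowledge_aware)
```

With `health_weight=0.0` the ranking simply mirrors what the user already clicks on; raising the weight moves dishes that fit the user's health limits toward the top.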
The model deployment phase requires knowledge-based trust.
Whether an AI model can be deployed depends largely on how dependable it is. The first aspect is credibility, that is, whether its results are trusted: deep learning is held back by interpretability problems and is trusted less than human experts in specialized fields such as medicine. The second is reliability, that is, whether it still performs well under interference, which is the robustness problem.
Professor Zhang Bo, academician of the Chinese Academy of Sciences and dean of the Institute for Artificial Intelligence at Tsinghua University, once proposed that AI applied in industry needs to meet five conditions: rich data or knowledge, complete information, deterministic information, a static environment, and a specific field or single task. If any one of the five is not met, industrializing AI becomes very difficult.
One way out of this dilemma is knowledge computing, which lets AI systems read knowledge and learn common-sense reasoning so that models become trustworthy and highly reliable.
Google, for example, combined NLP with knowledge graphs to make its search results more credible and persuasive. Suppose the search engine finds articles stating that "XX has worked in China". If fusing this with a knowledge base shows that XX once worked for the China Trade Commission and that the organization has an office in Beijing, the credibility of "XX has worked in China" rises considerably.
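The corroboration step in this example can be sketched as a lookup over knowledge-base triples. This is a minimal illustration, not Google's implementation; the triples, the relation names (`worked_for`, `has_office_in`, `located_in`), and the function names are hypothetical.

```python
# Hypothetical knowledge base of (subject, relation, object) triples.
TRIPLES = {
    ("XX", "worked_for", "China Trade Commission"),
    ("China Trade Commission", "has_office_in", "Beijing"),
    ("Beijing", "located_in", "China"),
}

def objects(subject, relation):
    """All objects linked to `subject` by `relation`."""
    return {o for s, r, o in TRIPLES if s == subject and r == relation}

def places_of(organization):
    """Office locations of an organization, expanded upward via located_in."""
    places = set(objects(organization, "has_office_in"))
    frontier = list(places)
    while frontier:
        place = frontier.pop()
        for parent in objects(place, "located_in"):
            if parent not in places:
                places.add(parent)
                frontier.append(parent)
    return places

def supports_worked_in(person, place):
    """Employers whose office locations back up the claim 'person worked in place'."""
    return sorted(
        org for org in objects(person, "worked_for") if place in places_of(org)
    )

evidence_china = supports_worked_in("XX", "China")
evidence_france = supports_worked_in("XX", "France")
print(evidence_china)
print(evidence_france)
```

A non-empty result is supporting evidence rather than proof, since an office in a country does not guarantee that the person worked there, but it lets the system attach a checkable reason to the extracted claim.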
Similarly, suppose an autonomous driving system extracts and learns everyday driving common sense from large-scale text, such as "a large truck ahead blocks your line of sight, so be careful, because a pedestrian may suddenly dart out". Such a grasp of common-sense knowledge would greatly increase people's confidence in the safety of autonomous driving.
The model application phase requires knowledge-based computation.
At present, a major bottleneck in industrial intelligence is the high cost of computing power. Massive deep neural networks require large amounts of computing resources to handle complex tasks. A study from the University of Massachusetts found that training several common large AI models can emit more than 626,000 pounds of carbon dioxide, almost five times the lifetime emissions of an average car.
Taking a step back, human thinking (itself a kind of knowledge computing) is very energy-efficient. In "Thinking, Fast and Slow", the psychologist Kahneman proposed that the brain can either think slowly and rationally through System 2 or, through System 1, draw on internalized knowledge to operate quickly and unconsciously, much like muscle memory, with very little energy consumption.
In the future, building knowledge-based AI models that work like activated brain regions will become an important green-computing method for ensuring the sustainable development of industrial intelligence.
It is not hard to see that combining industry knowledge with AI computing is both an inevitable stage in the development of the theory and technology and an indispensable step for AI in actual industrial use.
As an applied technology, AI can build lasting value and keep the third wave of artificial intelligence surging forward only by truly accepting and integrating industry knowledge, and by turning computing and knowledge into the productive forces of a new era.
A Tough Climb: How Many Steps from the Data Layer to the Knowledge Layer?
Talking about technical prospects without regard to the conditions for applying them is just pie in the sky, and knowledge-based AI has prerequisites of its own. At least the following characteristics are required:
1. Accuracy of knowledge representation.
For AI to understand knowledge and use it to solve complex real-world problems, that knowledge must first be translated into mathematical language, into a form that AI can compute on.
However, there are many types of knowledge that need to be represented in an AI system, and it is not easy to represent them comprehensively and accurately.
Some knowledge is easy to characterize, such as declarative knowledge and procedural knowledge about how to do something. Other knowledge is hard to describe, such as heuristic knowledge drawn from the experience of experts in a field, which is not necessarily all correct, and structural knowledge about relations between concepts, such as the interactions between molecules, which humans themselves do not yet fully understand.
The accuracy of knowledge representation will directly affect whether machines can be as intelligent as humans.
2. Diversity of knowledge reasoning.
The ability to reason, and creative thinking in particular, is the biggest difference between humans and other species. The core capability of knowledge computing is likewise reasoning: generating new knowledge from the existing representation structure and providing creative insights to industry.
One can easily imagine the following scenario. A huge knowledge base stores the knowledge humans need to complete all kinds of tasks. AI no longer has to be trained for each specific scene and each specific data set. Instead, like a truly intelligent human being, it reasons by analogy, generalizes from one example to many, and easily completes the reasoning and analysis needed to cope with complex tasks in the real world.
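One very simple form of generating new knowledge from an existing representation is property inheritance along an is-a hierarchy. The sketch below is a toy illustration only; the concepts, the properties, and the function `infer_properties` are hypothetical, and real common-sense reasoning is far harder than this.

```python
# Hypothetical common-sense knowledge base (toy simplification).
IS_A = {"tabby": "cat", "cat": "mammal", "mammal": "animal", "truck": "vehicle"}
PROPERTIES = {
    "animal": {"can_move"},
    "mammal": {"has_fur"},
    "cat": {"says_meow"},
    "vehicle": {"can_block_view"},
}

def infer_properties(concept):
    """Collect the properties of a concept and of all its ancestors."""
    inferred = set()
    seen = set()  # guards against accidental cycles in IS_A
    while concept is not None and concept not in seen:
        seen.add(concept)
        inferred |= PROPERTIES.get(concept, set())
        concept = IS_A.get(concept)
    return inferred

tabby_properties = infer_properties("tabby")
truck_properties = infer_properties("truck")
print(sorted(tabby_properties))
print(sorted(truck_properties))
```

Nothing is stored about "tabby" directly, yet the system can answer questions about it, which is a small-scale version of applying existing knowledge to a case it was never specifically trained on.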
3. Automation of knowledge acquisition.
Building a common-sense knowledge base is no easy task; it has been called the "Manhattan Project of AI". With the massive data produced by the information explosion, the job of turning information into knowledge has to be taken over by machines, and automation has become a hard nut that must be cracked in order to make knowledge acquisition efficient.
Acquiring new knowledge by automated methods can speed up the iteration of AI knowledge systems, enable models to update automatically, and shorten the time needed to build industry knowledge graphs.
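At its simplest, automated knowledge acquisition means turning free text into structured triples. The following minimal sketch uses a single hand-written pattern and is only an illustration; production systems rely on much richer NLP pipelines, and the pattern, the sample text, and the function `extract_triples` are assumptions made for this example.

```python
import re

# One hand-written pattern: "<X> is a/an <Y>" becomes (X, "is_a", Y).
PATTERN = re.compile(r"(?P<subject>.+?) is an? (?P<object>.+)")

def extract_triples(text):
    """Extract (subject, 'is_a', object) triples from simple sentences."""
    triples = []
    for sentence in text.split("."):
        match = PATTERN.fullmatch(sentence.strip())
        if match:
            triples.append((match["subject"], "is_a", match["object"]))
    return triples

sample = "DENDRAL is an expert system. Beijing is a city. Knowledge guides action."
triples = extract_triples(sample)
print(triples)
```

Sentences that do not fit the pattern are silently skipped, which shows why coverage and accuracy are the hard part of automating knowledge acquisition.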
4. Efficient application of knowledge.
Different industries accumulate, apply, and manage knowledge in very different ways, and it is unrealistic for each enterprise to build its own customized toolset. For knowledge computing to take hold in industry, it therefore also needs a set of standardized tools that provide knowledge search, high-performance querying, visual analysis, and similar functions to make knowledge mining more efficient.
Because this is a newly emerging technology direction, forward-looking platform technology companies and organizations need to build the infrastructure well and open their capability interfaces to enterprises in all walks of life.
Data and information describe the world; knowledge and wisdom understand it. From this point of view, the higher AI stands on the DIKW pyramid, the more capable it is and the closer it comes to strong artificial intelligence. The climb is not easy, but it is the road that AI industrialization and industrial AI must take.
Finally, when AI ascends to the tip of the pyramid and gains true wisdom, we can't be sure if AI will be the smartest object on Earth. Or is humanity still at the highest level of intelligence?
As T. S. Eliot writes in his poem: "Where is the wisdom we have lost in knowledge? / Where is the knowledge we have lost in information?"
Once upon a time, wisdom was something unique to human beings, the mark of humankind as the paragon of all creatures. In the digital age, many people acquire less and less knowledge, think actively less and less, and sink ever deeper into a sea of fragmented data and information.
Perhaps, as we witness AI climbing toward the tip of the pyramid, it is more important to be a little alert to the human slide towards the bottom of the pyramid.