What first led people to recognize the relationship between tacit knowledge and artificial intelligence was the use of learning-capable neural networks to extend scientific experience, which tied tacit knowledge unambiguously to a particular device. Facial recognition is perhaps the simplest example. Since ancient times, facial recognition has been (scientific) empirical knowledge, and explicitly so. Why is it explicit? Facial recognition essentially involves a subject seeing an image and then identifying a person in a crowd (or in memory) on the basis of that image. Seeing the image is obtaining reliable information: a universally repeatable, controlled observation. Picking the person out of the crowd (or memory) is completing a selection, that is, the subject's control over a controllable variable. Facial recognition, as scientific empirical knowledge, is thus a definite connection between obtaining information and achieving a certain control. Ordinarily the subject is aware of the recognition process, that is, of how this connection is established, so the knowledge belongs to explicit knowledge within scientific empirical knowledge. But when facial recognition is carried out in another way, say by an artificial neural network with learning capability, the subject does not know how the network does it, and facial recognition becomes tacit knowledge.
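The point can be made concrete with a toy sketch (illustrative only, not a real face recognizer): a tiny perceptron learns a mapping from "image features" to an identity label. The feature vectors below are invented for the example. After training, the knowledge lives in opaque numeric weights; we can use the classifier without being able to articulate how it decides.

```python
# Minimal sketch: a perceptron ties "information" (a feature vector)
# to "control" (a recognition decision) through learned weights.
import random

random.seed(0)

def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Learn weights w and bias b so that sign(w.x + b) matches labels (+1/-1)."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:          # misclassified: nudge the weights
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Synthetic "feature vectors" for two faces (hypothetical data):
person_a = [[1.0 + random.gauss(0, 0.1), 0.2 + random.gauss(0, 0.1)] for _ in range(20)]
person_b = [[0.2 + random.gauss(0, 0.1), 1.0 + random.gauss(0, 0.1)] for _ in range(20)]
samples = person_a + person_b
labels = [1] * 20 + [-1] * 20

w, b = train_perceptron(samples, labels)
print(predict(w, b, [1.0, 0.2]))   # recognized as person A (+1)
print(predict(w, b, [0.2, 1.0]))   # recognized as person B (-1)
```

The trained `w` and `b` are the whole of the machine's "knowledge" of the two faces; inspecting them tells an observer nothing about *how* recognition is performed, which is exactly the sense in which the knowledge is tacit.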
Today, facial recognition devices are widely used in society. Facial recognition is an ability based on experience; when a machine implements this ability, that machine possesses intelligence. We have seen that when a certain kind of tacit knowledge is acquired by means of a man-made device, that device necessarily corresponds to an intelligence that acquires that knowledge. In many people's minds, compared with the almost instinctive, simple ability of facial recognition, playing chess truly demonstrates advanced intelligence. However, if we analyze the process of playing chess, we find that it too consists of the subject acquiring (scientific) empirical knowledge; only the process of acquisition is more complex. Like facial recognition, playing chess is a matter of the subject exercising a certain control (the next move) on the basis of acquired information (the board position). If facial recognition is a one-time correspondence between information and control, then chess experience is a sequence of such correspondences between acquiring information and exercising control: each turn, the subject makes a choice based on the information of the board, thereby changing the board, and then makes another choice based on the board as changed by the opponent. This sequence of choices has its own clear goal, the final outcome of the game, which is determined by the result of each move.
Since the knowledge of playing chess is essentially knowledge of how to make a choice after receiving each piece of information, the choice dictated by each piece of information is itself a form of (scientific) empirical knowledge. Like facial recognition, this knowledge can be learned by training an artificial neural network. But because the sequence of choices is long, and successive choices are interdependent, the outcome of a game depends on the final result of the whole sequence. All of this makes the structure and training method of the artificial neural network far more complex than for facial recognition. However, no matter how complex its structure and training, it is still, overall, a learning machine as shown in Figure 4-2. Therefore, after a learning machine composed of neural networks learns to play chess, how to play chess also becomes tacit knowledge, and directly corresponding to this tacit knowledge is the ability to play. This explains why it was so important for a learning machine composed of neural networks to defeat the world Go champion: it symbolized the arrival of artificial intelligence on the historical stage.
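The contrast between a one-shot correspondence and a sequence of correspondences can be sketched with a much simpler game than chess. The example below uses Nim (take 1 to 3 stones; whoever takes the last stone wins) as a stand-in: play is a loop of observe-state, choose-move steps. Here the policy is a transparent hand-written rule; in a neural-network learning machine the same role is played by opaque learned weights.

```python
# Illustrative sketch: game play as a *sequence* of (observe state ->
# choose move) correspondences, unlike the one-shot mapping of recognition.

def policy(pile):
    """Choose how many stones to take given the observed pile (the 'board')."""
    move = pile % 4                   # leave the opponent a multiple of 4
    return move if move > 0 else 1    # no winning move available: take 1

def play(pile, first_player=0):
    """Alternate observe->act steps until the pile is empty; return the winner."""
    player = first_player
    while pile > 0:
        move = policy(pile)           # information (pile) -> control (move)
        pile -= move
        if pile == 0:
            return player             # this player took the last stone and wins
        player = 1 - player

print(play(10))  # 0: the first player, facing 10 (not a multiple of 4), wins
```

Each pass through the loop is one information-to-control correspondence; the "clear goal" of the sequence is the return value, the final winner, which is what a training procedure would optimize.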
On March 9, 2016, a public match began between the neural network learning machine AlphaGo and the world Go champion, professional 9-dan player Lee Sedol. Anyone with experience of Go knows how complex this empirical knowledge is; mastering it undoubtedly requires a considerable level of intelligence. Did AlphaGo possess the intelligence to handle the ever-changing situations of a Go game? People were skeptical. But AlphaGo won the match 4-1, the first time in history that a world Go champion had lost to a neural network learning machine. The match signified not only that a neural network learning machine had defeated a human, but also that humans could not understand how AlphaGo played: the machine's way of playing Go was genuinely tacit knowledge. The relationship between tacit knowledge and intelligence was finally recognized by the world. Isn't this the truth? The concept of tacit knowledge was invoked by Wang Weijia to explain how neural network machines acquire empirical knowledge in a way completely different from humans. Since then, it has come to be generally recognized that as long as the subject clearly defines what empirical knowledge is to be acquired, neural network learning machines can replace humans; that is, they can be applied to any field of acquiring scientific empirical knowledge.
Controversy in the proof of the Four Color Theorem for maps
As early as the 1940s, when the mathematical model of the neural network was first proposed, it had already been shown to be equivalent to a finite automaton and, when supplied with unbounded external memory, to a Turing machine. That is, it could be used not only to express learning through the interaction of electrical-impulse inputs and outputs, but also to perform computation and logical reasoning. Therefore, for the other type of scientific knowledge, mathematical knowledge, if deriving a theorem from axioms requires some kind of computing device, this proves that tacit knowledge is involved there as well. The Turing machine is the abstract model of the general-purpose computer. The invention of the general-purpose electronic computer (the von Neumann machine) predates neural network learning machines; therefore, before tacit knowledge was discovered within scientific empirical knowledge, humanity already knew that tacit knowledge existed in mathematics. The proof of the Four Color Theorem for maps is a typical example.
The Four Color Theorem was proposed in 1852 by Francis Guthrie, a British amateur mathematician. It states that any planar map can be colored with only four colors in such a way that no two countries sharing a common border receive the same color. A map, or planar graph, is defined by a set of axioms in graph theory; proving the Four Color Theorem therefore means deriving it from these axioms. Proving the conjecture turned out to be extremely difficult. In 1879, British mathematician Alfred Kempe argued that if some map requires at least five colors, then a regular five-color map must exist. He then reasoned that if a regular five-color map exists, there must be a minimal one with the fewest countries; and that if any country in this minimal regular five-color map has fewer than six neighboring countries, a regular five-color map with even fewer countries must exist, contradicting minimality. Hence no minimal regular five-color map can exist, and consequently no regular five-color map at all. By this reductio ad absurdum, Kempe believed he had proven the Four Color Theorem. However, in 1890 British mathematician Percy Heawood discovered a flaw in Kempe's proof.
Although Kempe's proof was refuted, two concepts he introduced provided a path toward solving the Four Color Theorem. The first is the configuration. Kempe proved that in every regular map at least one country has two, three, four, or five neighboring countries; there is no regular map in which every country has six or more neighbors. The second is reducibility: as long as some country in a five-color map has four or five neighboring countries, there exists a five-color map with fewer countries. After the concepts of configuration and reducibility were introduced, mathematicians gradually developed a standard method for checking whether a configuration is reducible. Researchers found that proving the reducibility of the configurations needed for the Four Color Theorem requires examining an enormous number of details. In 1950, German mathematician Heinrich Heesch, after extensive experimentation, pointed out that proving the Four Color Theorem by configuration reduction would involve more than 10,000 configurations. Checking so many configurations one by one would be an enormous undertaking, beyond unaided human capability. For this reason, the Four Color Theorem was for a time thought to be unprovable.
On April Fool's Day in 1975, the American mathematician and popular science writer Martin Gardner published a map in Scientific American, claiming that the Four Color Theorem had been disproven because the map required at least five colors. Of course, this was a joke. But no one expected that just a few months later Wolfgang Haken, a mathematician at the University of Illinois, would improve Heesch's method. Collaborating with American mathematician Kenneth Appel, he designed a computer program, and with the participation of computer specialist John Koch they finally completed a proof of the Four Color Theorem in June 1976. They used exhaustive testing to examine 1,482 configurations, proving one by one that all were reducible, i.e., that none required five colors. The work involved some 10 billion logical decisions on two IBM 360 computers running for more than 1,200 hours, with both machines yielding the same result. The computer proof of the Four Color Theorem caused a sensation, but it also provoked enormous controversy.
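The mechanical, case-by-case character of such checking can be illustrated in miniature (this is only a toy, not Appel and Haken's actual reducibility procedure): a backtracking search for a 4-coloring of a small map represented as an adjacency graph, where vertices are countries and edges join countries that share a border.

```python
# Toy illustration of exhaustive, mechanical checking: backtracking
# search for a 4-coloring of a planar adjacency graph.

def four_color(adjacency):
    """Return a color (0-3) per country such that neighbors differ, or None."""
    countries = list(adjacency)
    coloring = {}

    def assign(i):
        if i == len(countries):
            return True
        c = countries[i]
        for color in range(4):
            # Try each color not already used by an adjacent country.
            if all(coloring.get(n) != color for n in adjacency[c]):
                coloring[c] = color
                if assign(i + 1):
                    return True
                del coloring[c]       # backtrack and try the next color
        return False

    return coloring if assign(0) else None

# A wheel-like map: country E borders A, B, C, D, which form a cycle.
borders = {
    "A": ["B", "D", "E"],
    "B": ["A", "C", "E"],
    "C": ["B", "D", "E"],
    "D": ["A", "C", "E"],
    "E": ["A", "B", "C", "D"],
}
result = four_color(borders)
print(result is not None)  # True: four colors suffice for this map
```

A run like this decides one small map; the Appel-Haken proof's billions of machine decisions over 1,482 configurations were of this mechanical kind, at a scale no human could verify by hand, which is exactly what made the proof controversial.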
The configuration-reduction program designed by Haken and his colleagues is certainly correct, but the reductions and the corresponding judgments were performed separately on two computers, and the way to verify reliability was to check whether the results of the two machines agreed. This approach resembles verification in engineering and physics: establishing the reliability of a computer run is exactly like establishing the reliability of information obtained by observing an object through an electron microscope. The American mathematician William Thurston once remarked that the standard of correctness and completeness necessary to get a computer program to work at all is a couple of orders of magnitude higher than the mathematical community's standard of valid proofs. Even so, no one has verified every step of the proof of the Four Color Theorem, because it is impossible for a human to do so. For this reason, many mathematicians do not accept the proof; the mathematician Frank Bonsall, for one, argued that such a proof does not belong to mathematics at all.
I believe the controversy surrounding the proof of the Four Color Theorem can be addressed from two perspectives. First, some people are unaware that tacit knowledge also exists within mathematical knowledge. The proof of the Four Color Theorem belongs to pure mathematics, specifically graph theory. The theorems of graph theory are determined by the corresponding fundamental axioms: once the axioms are laid down, all the reliable information of the corresponding symbolic system is already given, and proving a theorem is simply discovering the information injected by those axioms. In principle, the process by which the subject obtains the reliable information injected by the axioms can be separated from the subject's knowledge of how each step of that information was obtained. The existence of tacit knowledge in mathematical knowledge is therefore undeniable. However, tacit knowledge makes up only a very small proportion of all the mathematical knowledge humanity possesses, and mathematicians are not yet accustomed to it. To them I pose a question: why must there exist a non-exhaustive proof, different from the proof of the Four Color Theorem we now have? Mathematicians will eventually have to accept the existence of tacit knowledge within mathematical knowledge.
Furthermore, mathematicians refuse to acknowledge the proof of the Four Color Theorem for another reason: they reject the idea that such a proof reflects human mathematical ability. Einstein famously compared pure mathematics to the poetry of logical ideas, but the proof of the Four Color Theorem relies on a computer exhaustively searching cases that humans cannot handle, rather like looking up entries in a phone book. Mathematicians find it hard to imagine that such rapid, phone-book-style exhaustive search constitutes mathematical intelligence; this is why research in this field is still called machine proof rather than artificial intelligence. However, every form of tacit knowledge corresponds to a kind of intelligence. Can we deny that the devices that help mathematicians acquire tacit knowledge are artificial intelligence? Certainly not. Pure mathematics is the symbolic expression of infinitely expandable, universally repeatable, controlled experiments; when a subject studies the structure of a pure symbolic system, any ability to process that symbolic system is certainly intelligence. Thus in mathematics, thanks to the use of computers, the acquisition of tacit knowledge through artificial intelligence actually predates its acquisition in the empirical realm; it is only its scarcity that has kept it from being recognized.
The history of artificial intelligence
In fact, tracing the history of artificial intelligence reveals that it originated in the use of electronic computers to process symbolic structures. As early as 1956, dozens of scholars from mathematics, psychology, neuroscience, computer science, and electrical engineering gathered to discuss how computers could process symbolic structures. At this meeting, American computer scientist John McCarthy named the research program artificial intelligence. Computer scientists Allen Newell and Herbert Simon brought to the conference a program known as the Logic Theorist, and they were hailed as "fathers of artificial intelligence." Looking back at these studies called artificial intelligence, all of them essentially involved inventing a device for processing information (symbolic systems or numbers) so as to organize and extract knowledge for solving problems. This is true both of the Logic Theorist and of the General Problem Solver published in 1959 by Simon, Newell, and programmer John Clifford Shaw. What distinguishes this kind of device from the two types discussed earlier (neural networks with learning capability and computers for proving mathematical theorems) is that what it processes is neither purely scientific empirical knowledge nor purely mathematical knowledge, but scientific theoretical knowledge. Because this line of artificial intelligence was defined from the outset as using a device to solve practical problems by applying scientific theoretical knowledge, or to propose scientific theories from experience, it became widely known as the expert system.
For example, beginning in 1965, American artificial intelligence scientist Edward Feigenbaum, molecular biologist Joshua Lederberg, and chemist Carl Djerassi used a device to process data collected from Mars to see whether life could possibly exist there. Their collaboration produced the first expert system, DENDRAL, which took mass-spectrometry data as input and output the chemical structure of the substance in question. Another example is the BACON discovery program, developed between 1978 and 1983 by Herbert Simon, computer scientist Pat Langley, and Gary Bradshaw, which rediscovered a series of well-known physical and chemical laws. Looking across the various expert systems, each involves creating a device to acquire information about scientific theories. Sometimes an expert system uses scientific theory to derive empirical knowledge for solving practical problems; at other times it proposes scientific theories from controlled-experiment information, or modifies existing theories. These tasks were originally performed by humans; expert systems are artificial devices built to take over that work.
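A toy sketch in the spirit of BACON-style discovery (not the actual BACON program, whose heuristics were more elaborate) makes the idea concrete: given measured pairs (x, y), search small integer exponents (p, q) for which x^p / y^q stays constant, i.e., a candidate law x^p ∝ y^q. The data below are approximate solar-system values chosen to recover Kepler's third law.

```python
# BACON-flavored sketch: search for an invariant ratio x**p / y**q in data.

def find_invariant(data, max_exp=3, tol=0.01):
    """Return (p, q) such that x**p / y**q is (nearly) constant, else None."""
    for p in range(1, max_exp + 1):
        for q in range(1, max_exp + 1):
            ratios = [x**p / y**q for x, y in data]
            mean = sum(ratios) / len(ratios)
            if all(abs(r - mean) / mean < tol for r in ratios):
                return p, q           # candidate law: x^p proportional to y^q
    return None

# x = orbital period in years, y = semi-major axis in AU (Mercury, Venus,
# Earth, Mars, Jupiter): Kepler's third law says period^2 = axis^3.
observations = [(0.241, 0.387), (0.615, 0.723), (1.0, 1.0),
                (1.881, 1.524), (11.86, 5.203)]
print(find_invariant(observations))  # (2, 3): period^2 / axis^3 is constant
```

The program outputs a law without any record of physical insight; a user who trusted such a device for new laws would hold exactly the kind of theoretical knowledge the chapter calls tacit.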
Here I must ask: does the reliable information obtained by these expert systems contain tacit knowledge? By the definition of tacit knowledge, that would mean we know what the final reliable information is but do not know how it was obtained. No known expert system today possesses such intelligence. Expert systems cannot do this because they can neither learn autonomously like artificial neural networks nor make logical judgments without human intervention like machine proof. As discussed earlier, the advance of scientific theoretical knowledge relies on the interaction between scientific theoretical information and empirical information. This includes determining which of the two must be modified when the controlled-experiment information derived from scientific theory fails to contain the corresponding controlled-experiment information of scientific experience; that is, how the two interact. Only when this interaction forms a complete closed loop, with no subjective participation, can it be fully realized by an artificial device. At that point the subject obtains scientific theoretical information through the device without knowing how the information was obtained. This is tacit knowledge in scientific theory.
Currently, such devices are under development, and ChatGPT, the large-scale pre-trained AI language model launched by the US AI research laboratory OpenAI, may be an example. Why? Scientific theory, as an archway spanning controlled experiments and the mathematical world, consists of an arch and a cover. The cover is the body of controlled experiments and controlled observations expressed in logical language, i.e., the scientific knowledge currently recorded in the literature. The arch is the set of relationships among the results of controlled experiments (and observations) expressed quantitatively through measurement; these are the laws that serve as the foundations of the various sciences. ChatGPT uses grammatical analysis of natural language to transform natural-language statements into logical-language statements, and then automatically extracts the information those statements contain. When a person obtains new scientific theoretical knowledge through ChatGPT without knowing how that knowledge was obtained, the knowledge satisfies the definition of tacit knowledge. As for the arch, computer-controlled experiments and observations can likewise be established, in which obtaining measurement data and achieving control require no direct participation by the subject. In that case the researcher likewise does not know how a new law was obtained. This too is tacit knowledge of scientific theory.
More importantly, the cover is built upon the arch. Once new information derived from statements is combined with computer-run experiments, we will find that within scientific theories the growth of tacit knowledge outpaces that of explicit knowledge. In today's rapidly developing synthetic biology, the possibility of combining the two is already brewing. This means that the third form of artificial intelligence (devices for acquiring tacit knowledge within scientific theories) may first appear in the life sciences, underpinned by a deep intersection of physics, chemistry, mathematics, information theory, and theories of life, forming enabling technologies for gene synthesis, gene editing, protein design, cell design, and experimental automation. The biofoundry is perhaps a typical example. In building and commissioning the biofoundry, a large-scale facility, some Chinese researchers have taken "creating things to acquire knowledge" as their slogan, following the traditional Chinese idea of "investigating things to acquire knowledge." The builders of this facility may not have realized that once the facilities of synthetic biology close the loop for revising scientific theoretical knowledge, tacit knowledge within scientific theory will be generated. Such "acquired knowledge" is not quite the knowledge we are familiar with: the operation of the facility will continuously produce synthetic organisms, but this will not necessarily increase theoretical knowledge of the life sciences as we know it today, because a considerable part of that knowledge may be tacit.
The discovery of abundant tacit knowledge within scientific theories marks a revolution in the causal explanation of natural phenomena. We know that natural phenomena obey causal laws, yet humans can experience the process of comprehension for only a tiny fraction of them. Even so, this does not prevent humanity from using causality to transform the world, because even if most causal laws are tacit knowledge, we can still harness them through artificial intelligence.