The Future Chip Forum is a major annual academic conference hosted by the School of Integrated Circuits at Tsinghua University and the Beijing Advanced Innovation Center for Integrated Circuits, providing a cross-disciplinary platform for exchange among universities, research institutions, and enterprises in the field. Professor Wei Shaojun, Academician of the International Eurasian Academy of Sciences and Professor at Tsinghua University, shared some of his thoughts on future chips.

What characteristics should an ultra-high-performance computing chip architecture possess?

High-performance computing has entered the exascale era. Last year, the U.S. Department of Energy unveiled the world's first exascale supercomputer. Exascale computing is an important milestone: an exascale machine performs 10^18 (one quintillion) double-precision floating-point operations per second. Shortly after the system's release, the President's Council of Advisors on Science and Technology announced that the next goal for U.S. high-performance computing is zettascale computing, a thousand times faster than exascale.
The industry's pursuit of higher computing speed is endless. Moreover, the total volume of data keeps growing and is expected to reach roughly 100 ZB by 2024. This growth, especially with the rise of artificial intelligence, makes zettascale computing an unavoidable challenge for the industry. Wei Shaojun points out, "Relying on process advances alone is almost insufficient to achieve higher-performance computing." The reason is simple. The leading U.S. exascale machine, built on a 6nm process, already draws 21.1 megawatts and occupies 680 square meters. Semiconductor processes continue to advance through 5nm, 4nm, and 3nm. Yet even if a 3nm process were used to build a zettascale machine with today's architectures, its power consumption would reach roughly 8,000 megawatts, that is, 8 million kilowatt-hours of electricity per hour, or approximately 4 million RMB per hour in electricity costs. As for cost, building a zettascale system on a 3nm process would run to roughly $600 billion.
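The power and cost figures above follow from simple arithmetic. A minimal sketch, using the numbers quoted in the speech and assuming an electricity price of about 0.5 RMB/kWh (the tariff implied by "8 million kWh per hour, approximately 4 million RMB per hour"):

```python
# Back-of-envelope check of the zettascale scaling argument.
# Figures from the speech: today's exascale machine draws 21.1 MW on a
# 6 nm process; a 3 nm zettascale design is quoted at ~8,000 MW.
# The electricity price of 0.5 RMB/kWh is an assumption inferred here.

exa_power_mw = 21.1            # current exascale system
scale_factor = 1000            # zettascale = 1000x exascale throughput
naive_zetta_mw = exa_power_mw * scale_factor  # ~21,100 MW with no process gain

zetta_power_mw = 8000          # figure quoted for a 3 nm implementation
kwh_per_hour = zetta_power_mw * 1000    # 1 MW sustained = 1,000 kWh each hour
cost_rmb_per_hour = kwh_per_hour * 0.5  # assumed tariff: 0.5 RMB per kWh

print(kwh_per_hour)       # 8,000,000 kWh consumed every hour
print(cost_rmb_per_hour)  # 4,000,000 RMB of electricity every hour
```

The gap between the naive 21,100 MW and the quoted 8,000 MW shows that process scaling recovers only part of the thousandfold power growth, which is the speech's point.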
While advances in manufacturing processes can reduce costs, the direct cost reduction is not significant overall, especially considering the footprint required (hundreds of thousands of square meters) and the resulting latency. Therefore, relying on process improvements alone is insufficient to reach higher computing performance. In today's computing chips, the compute units themselves occupy only a small fraction of total chip resources, and their utilization is lower still, often below 0.1%. Meanwhile, the energy spent on data movement is extremely high: in a GPU, for example, more than 90% of the energy goes to moving data rather than computing. Given these fundamental characteristics, achieving next-generation computing with current architectures and chips is incredibly difficult. Furthermore, the demands on computing power have reached unprecedented levels. The rapid development of artificial intelligence is well known. Current AI follows two technical routes: neuromorphic computing and deep learning. Both rest on three fundamental elements: algorithms, data, and computing power.
Computing power has been a major driving force in artificial intelligence, yet current AI still falls far short of expectations. On one hand, its algorithms differ from human cognition, so AI must be adapted to each application. On the other hand, current AI implementations are rather brute-force. For example, a simple model from 2014 required roughly 19.6 billion operations per inference while holding 138 million parameters. Such density of computation and storage poses significant challenges even to today's chips. Wei Shaojun stated, "Computer architecture has entered a 'golden age.' Simply following traditional methods is no longer a viable way to innovate in architecture." Future supercomputing will have to be built for less than 10 billion RMB, consume less than 100 megawatts, and fit within about 10,000 square meters. These constraints place demanding new requirements on chips, hardware, and software.
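The 2014 model is not named in the speech, but the quoted figures (about 19.6 billion operations and 138 million parameters) match the well-known VGG-16 network. Assuming that is the model meant, a short sketch reproduces the parameter count:

```python
# Parameter count of a VGG-16-style network (an assumption: the speech's
# "simple model from 2014" with ~138M parameters matches VGG-16).

def conv_params(c_in, c_out, k=3):
    """Weights (k*k*c_in*c_out) plus one bias per output channel."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Fully connected layer: weight matrix plus biases."""
    return n_in * n_out + n_out

# VGG-16 convolutional blocks: output channels of each conv layer.
blocks = [[64, 64], [128, 128], [256, 256, 256],
          [512, 512, 512], [512, 512, 512]]

total, c_in = 0, 3  # start from a 3-channel RGB input
for block in blocks:
    for c_out in block:
        total += conv_params(c_in, c_out)
        c_in = c_out

# After five 2x2 poolings, a 224x224 input shrinks to 7x7x512 = 25088 features.
total += fc_params(7 * 7 * 512, 4096)
total += fc_params(4096, 4096)
total += fc_params(4096, 1000)  # ImageNet's 1000 output classes

print(total)  # 138,357,544: the ~138 million parameters quoted above
```

Every parameter must be fetched from memory during inference, which is why the speech treats this storage density, not just the operation count, as a chip-level challenge.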
What basic components are needed for artificial intelligence chips?
Intelligentization extends our cognition. The industrial revolution, characterized by mechanization, electrification, and automation, liberated human hands, supplied enormous energy, and extended and amplified human physical abilities. The information revolution, driven primarily by computers, networks, communications, optoelectronics, and integrated circuits, extended and amplified human sensory capabilities. Information technology, together with artificial intelligence and new materials engineering, will now advance to new heights, further extending and amplifying the capabilities of the human brain. Since the first electronic computer in 1946, we have witnessed three waves of intelligentization.
The first wave of artificial intelligence can be traced back to Japan's Fifth Generation Computer project around 1990. By 2017, research had converged on machine-learning algorithms for classification and recognition. Today, Google's DeepMind can outperform professional human players by a factor of ten. Clearly, artificial intelligence has surpassed human capabilities in many areas. Why? A Canadian neuroscientist made key contributions here; inspired by his work, both neuromorphic computing and deep learning were developed, making it possible to train deep neural networks, a somewhat brute-force approach that nonetheless yields impressive results.
The mainstream architecture of AI chips has evolved from AI Chip 0.5 to AI Chip 1.7, and from cloud AI to edge AI. Wei Shaojun stated, "Computing power is a necessary condition for the development of artificial intelligence, and computing power relies on chips, so chips are indispensable. Later, chips designed specifically for artificial intelligence emerged." With one algorithm per application, N applications would require N different chips. To run different applications on a single chip, reconfigurable architectures emphasizing flexibility emerged. By improving both computing power and versatility across different algorithms, they further advance today's artificial intelligence.
Today, the structures the industry envisions focus mainly on computer architecture, and may require exploring new device technologies. This is not idle fantasy. We cannot create biological devices, so we should think in terms of silicon-based semiconductor devices that can support massive fan-in and fan-out, perform basic weighting and activation-function operations, employ in-memory computing, and offer ultra-low latency, ultra-low power consumption, and extremely low cost, while remaining manufacturable in current CMOS processes. In the future they could even be integrated in 3D. This is a crucial question we must consider now. If these new AI device technologies can be achieved, they may open a new path.

Do large models have a decisive impact on chip development?

Wei Shaojun is also pondering whether large models are indispensable for chip design, or whether they might even have a negative impact. He ran an experiment, repeatedly asking ChatGPT, "Why did Lin Daiyu fight the White Bone Demon three times?" (a deliberately nonsensical question: Lin Daiyu is a character in Dream of the Red Chamber, while the White Bone Demon appears in Journey to the West). GPT-4 and GPT-3 gave completely different answers. GPT-4 did not simply spout nonsense; it produced a more logically coherent story.
However, ChatGPT's capabilities only grew after it was trained on vast amounts of data, which means its inherent creative ability is quite limited. People perceive ChatGPT as full of novel ideas because it reflects the collective wisdom of many people. In reality, ChatGPT lacks creativity; its success rests on data and training. In dialogue and translation, its logical connections often go wrong. ChatGPT is not as intelligent as it seems.
Looking back, are large models helpful for chip design? Many believe the EDA industry is best placed to exploit large models for design. Wei Shaojun identified two areas of the EDA industry where large models can be applied: EDA tools themselves and design services, that is, large models combined with existing tools to produce a design. Completing a full design this way would normally be extremely difficult. On this question, therefore, Wei Shaojun's view is that large models certainly help with chip design, but the extent of that help deserves careful scrutiny.
What is the future direction of 3D integrated circuit technology?
Three-dimensional integrated circuits are gradually gaining attention. Moore's Law continues to advance, with density still increasing. Today, a 5nm process can integrate approximately 110 million transistors per square millimeter, equivalent to roughly 28 million logic gates. Our capacity for integration now exceeds what can be exploited within the achievable chip area.
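A quick sanity check of the gate-count arithmetic, assuming the common convention of roughly four transistors per NAND2-equivalent logic gate (an assumption of this sketch, not a figure from the speech):

```python
# Transistor density to gate density, assuming ~4 transistors per
# NAND2-equivalent gate (a conventional, approximate conversion).
transistors_per_mm2 = 110_000_000   # quoted 5 nm density
transistors_per_gate = 4            # assumed NAND2-equivalent convention

gates_per_mm2 = transistors_per_mm2 // transistors_per_gate
print(gates_per_mm2)  # 27,500,000: on the order of 28 million gates per mm^2
```

At tens of millions of gates per square millimeter, a reticle-limited die already holds far more logic than most designs can use, which motivates the turn to 3D integration discussed next.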
Basic devices keep evolving: from 45nm and 32nm down to 5nm and 3nm; from High-K metal gate to FinFET, which serves down to 7nm; and then to GAA at 3nm. GAA's lifespan looks short, perhaps only one or two generations, and whether scaling can continue beyond it is unknown. Two-dimensional devices and molecular devices have been proposed, but their feasibility and cost sustainability remain open questions. Chiplets and 3D packaging have also been proposed; in a sense, both are integration in a broader form, rather than traditional monolithic integration on a single chip. Wei Shaojun said that in the long run this approach is sound and may even bring significant advantages such as cost reduction: not every die needs the most advanced process, which shortens R&D and time-to-market. Beyond 3nm, vertically oriented transistors, growing the circuit upward rather than outward, are also worth considering. 3D NAND has already achieved this: Yangtze Memory Technologies Co., Ltd. (YMTC) uses stacking technology to build very high-performance 3D NAND.
Similarly, if logic and memory can be fused or integrated in this way, it would not only address the compute-storage problem but also open the way to three-dimensional integrated circuits.

In short, computing is ubiquitous. High-performance computing is a strategic high ground for future development and a focal point of great-power competition. Today's computing architectures and integrated circuit technology cannot support high-performance computing at the zettascale, and breakthroughs through architectural innovation are urgently needed. Intelligentization is an inevitable trend, and the development of artificial intelligence depends on advances in chip technology; today's chips can no longer meet the demands of rapidly developing AI, and breakthroughs must start from the basic devices. The arrival of large models has broadened people's thinking and suggested some novel directions, and many hold high expectations for large models assisting chip design. We cannot yet determine exactly how much help large models provide for chip design, but judgments based on their basic principles allow a more objective and sober view of their role in integrated circuit design. The essence of integrated circuits lies in "integration." Against the backdrop of Moore's Law, it is time to begin exploring the paths and methods of three-dimensional integration.

This article is based on a speech by Professor Wei Shaojun of Tsinghua University at the 8th Future Chip Forum.