Since the release of ChatGPT, AI has gained wide recognition, and its capabilities have become tangible to everyone. In recent years AI has been evolving rapidly on two fronts: on one hand, model capabilities have improved, making it more effective at complex language tasks; on the other, its application areas have expanded, helping professionals in various vertical fields improve productivity. More recently, the concept of "sovereign AI" has emerged, elevating AI to a matter of national security, and countries are paying closer attention to their own AI infrastructure capabilities.
For China, challenges and opportunities coexist. The country has abundant application scenarios and user data, putting it among the first to deploy AI in practice; at the same time, regional political frictions and chip bans are forcing it to build an end-to-end AI ecosystem, including the capability to develop its own underlying AI chips.
Recently, Gartner held a press conference to discuss the topic of "Chinese companies breaking through the limitations of artificial intelligence chips." Roger Sheng, Vice President of Research at Gartner, gave an insightful presentation.
With the advent of the AI era, the United States continues to intensify its restrictions on exports of high-performance AI chips to China.
In order to maintain technological dominance in the upcoming AI era, the United States has gradually escalated its restrictions on China's AI field in recent years, from the initial blacklist and bans to the current comprehensive embargo and technology blockade, involving multiple levels, especially in the fields of high-performance computing and chip manufacturing.
[Chip Embargo and Restrictions]
The US government began imposing restrictions on Chinese high-tech companies several years ago, particularly export bans on chips. In 2020, the US introduced a ban on artificial intelligence chips, directly cutting into Chinese companies' access to high-performance computing resources. In response, NVIDIA launched the A800 and H800 GPUs specifically for the Chinese market: performance-reduced versions of the A100 and H100 designed to comply with the export regulations. For example, the interconnect bandwidth of the A800 and H800 is capped at 400 GB/s, but the chips retain strong computing capability and can meet most AI and high-performance computing needs.
However, these measures were clearly insufficient to satisfy the US's restrictive intent. In 2022, the US lowered the threshold for restricted high-performance computing chips further, setting the performance limit at which the ban applies to 300 TFLOPS (300 trillion floating-point operations per second), roughly the level of the A100. In the second half of last year, the scope was tightened again: beyond overall computing performance, the rules now also target performance density per unit of die area, so chips exceeding 370 GFLOPS per square millimeter fall into the restricted scope as well. These stringent restrictions have significantly hampered China's progress in supercomputing and AI training.
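To make the two thresholds concrete, the figures above can be turned into a rough back-of-the-envelope check. This is an illustrative sketch, not the legal test in the actual export rules (which use their own specific metrics); the function name is my own, the thresholds are simply the numbers cited above, and the A100 figures in the example are public datasheet values.

```python
def exceeds_thresholds(total_tflops, die_area_mm2,
                       perf_cap_tflops=300.0,
                       density_cap_gflops_mm2=370.0):
    """Rough check of a chip against the two limits cited above.

    total_tflops  -- aggregate compute performance in TFLOPS
    die_area_mm2  -- die area in square millimetres
    Returns True if either the overall-performance cap or the
    performance-density cap is exceeded. An illustrative
    simplification, not the actual regulatory test.
    """
    # 1 TFLOPS = 1000 GFLOPS, so density in GFLOPS/mm^2:
    density_gflops_mm2 = total_tflops * 1000.0 / die_area_mm2
    return (total_tflops > perf_cap_tflops
            or density_gflops_mm2 > density_cap_gflops_mm2)

# An A100-class part: ~312 TFLOPS (FP16) on an ~826 mm^2 die
print(exceeds_thresholds(312, 826))
```

An A100-class part trips both limits: 312 TFLOPS exceeds the 300 TFLOPS cap, and 312,000 GFLOPS over an ~826 mm² die works out to roughly 378 GFLOPS/mm², above the 370 GFLOPS/mm² density cap.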
[Blockade of Manufacturing Processes and Supply Chains]
In addition to the embargo on the chips themselves, the United States has imposed severe restrictions on chip manufacturing processes and supply chains. High-end chips depend on advanced process nodes and equipment, such as the 14nm-and-below technologies offered by foundries like TSMC, which are crucial for improving chip performance and reducing energy consumption. The US ban prohibits the export of internationally advanced manufacturing processes to China, making it significantly harder for China to catch up with the global state of the art in chip manufacturing.
Meanwhile, the U.S. prohibits its citizens from working on advanced manufacturing equipment in China, directly impairing China's ability to attract top talent and technology. In one recent case, an executive at the Chinese equipment maker AMEC (Advanced Micro-Fabrication Equipment Inc.) chose to sell part of his stock to pay U.S. departure taxes in order to focus on semiconductor equipment manufacturing in China. Even under immense economic and personal pressure, Chinese companies and individuals are still pushing to advance domestic semiconductor manufacturing technology.
"Overall, the United States keeps 'plugging the loopholes,' while China keeps looking for ways to use its existing resources to wage 'guerrilla warfare.' That is the cycle we are in right now," Mr. Sheng said.
To break through the blockade, Chinese companies and research institutions are actively seeking solutions, striving to develop independent innovative technologies while actively seeking international cooperation and alternatives. For Chinese chip manufacturers, even if the current alternatives are not entirely satisfactory, only with the firm support and commitment of the government and enterprises can they gain market experience and catch up through continuous technological iteration.
The current industrial transformation in China and the future development of edge inference are bringing new opportunities to Chinese AI chip manufacturers.
Domestic companies building AI infrastructure under the blockade face three broad options. The first is to adopt alternatives from local suppliers. This applies especially to large domestic cloud service providers, government agencies, and state-owned enterprises, for whom the switch is unavoidable: they must find domestic alternatives to ensure that basic public services and government operations are not affected. However, local alternatives bring challenges such as lower training efficiency, performance limitations, and an incomplete ecosystem, and migrating existing workloads smoothly onto new domestic platforms involves a significant amount of additional work.
The second option is NVIDIA's downgraded products, which may suit internet companies, multinational corporations, and small and medium-sized enterprises. The advantage is a mature platform; the disadvantages are limited performance at a higher price.
The third option is to use NVIDIA chips through unofficial channels. This could include directly purchasing pre-built systems or leasing computing power. This method is only suitable for small and medium-sized companies, and its drawback is the lack of official guarantees and technical support.
In the long run, mastering underlying AI chip capabilities and building a local GenAI ecosystem is an inevitable path for China. The long-term situation is pushing domestic users toward local alternatives for building localized scenarios and applications, while for local chip suppliers it is imperative to seize this opportunity, support mainstream large models at the chip level, and give developers an efficient development experience.
"Regardless of the reasons, we've reached this point now; it's an inevitable trend," Mr. Sheng concluded.
Against this backdrop of upheaval, there is also a significant application-driven opportunity that Chinese chip manufacturers must seize. Gartner forecasts that demand for cloud inference chips will surpass demand for training chips starting in 2025, and that from 2026, as edge AI capabilities continue to drive the growth of GenAI applications, demand for edge inference chips will also rise significantly.
OpenAI's repeated service outages in 2024 illustrate how frequent and large-scale generative AI usage has become. As adoption grows, demand for cloud inference and computing power will inevitably rise rapidly. To improve energy efficiency and reduce cloud load, some inference workloads will migrate to edge devices, continuously strengthening edge AI capabilities.
"The device side can support models with one billion to ten billion parameters, while the edge side can support large models with ten billion to one hundred billion parameters. Both are actually enough for certain enterprise or individual applications, so from a technical point of view it is indeed feasible," Sheng summarized. "Generative AI applications on the edge and device sides will continue to spread from smartphones and computers to consumer IoT and smart homes, and then further to automobiles."
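Sheng's device/edge split can be sanity-checked with simple memory arithmetic. The sketch below is illustrative only: the function name is my own, and the bytes-per-parameter figures (2.0 for FP16, about 0.5 for 4-bit quantization) cover weights only, ignoring activations and KV cache.

```python
def model_memory_gb(num_params_billion, bytes_per_param=2.0):
    """Approximate weight-memory footprint of a model, in GB.

    num_params_billion -- parameter count in billions
    bytes_per_param    -- 2.0 for FP16, ~0.5 for 4-bit quantization
    """
    return num_params_billion * 1e9 * bytes_per_param / 1e9

# A 7B-parameter model, 4-bit quantized: ~3.5 GB of weights,
# small enough for a high-end phone or laptop NPU.
print(model_memory_gb(7, 0.5))

# A 70B-parameter model at FP16: ~140 GB of weights,
# which is edge-server rather than device territory.
print(model_memory_gb(70, 2.0))
```

This is why the billion-to-ten-billion range maps to devices and the ten-to-hundred-billion range maps to edge servers: the weight footprint alone spans two orders of magnitude between the two ends.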
Major chip manufacturers such as Qualcomm, MediaTek, and Intel are actively developing edge AI chips. How can Chinese AI chip manufacturers seize the GenAI opportunities at the device and edge levels? Mr. Sheng emphasized the importance of hardware standardization. To better utilize the potential of generative AI, Chinese chip manufacturers should consider collaborating to establish a unified standard, thereby promoting AI chip compatibility and the development of the software ecosystem. Such standardization will not only facilitate technological advancement but also simplify subsequent software development and ecosystem construction.
"Actually, looking back, our current situation is not that bad," Sheng said frankly. "China's models overall are not that bad either. We need to draw on our past experience of going from 'having nothing' to 'independent research and development,' and hold on to that confidence to develop AI chips and the AI industry."