The recent session of the National Committee of the Chinese People's Political Consultative Conference (CPPCC) also addressed this issue: "Artificial intelligence has become a crucial arena of scientific and technological competition among nations. We must tap more deeply into the computing power potential of domestically produced AI chips, accelerate the development of domestically produced operating systems, solidify the computing power foundation for the development of artificial intelligence, and help new productive forces accelerate their growth." With that in mind, let's discuss AI chips.
As an integrated circuit designed and manufactured specifically for AI computing needs, AI chips have not only revolutionized the way computers process information, but have also played a crucial role in many cutting-edge fields such as image recognition, speech recognition, natural language processing, and autonomous driving.
Basic Concepts of AI Chips
AI chips, also known as AI accelerators or intelligent chips, are specialized microprocessors designed specifically for efficiently running artificial intelligence algorithms. Unlike traditional general-purpose processors such as CPUs and GPUs, AI chips focus on solving large-scale parallel computing problems in AI applications, especially intensive mathematical operations for neural network models, such as matrix multiplication, convolution operations, and activation function calculations. This highly customized design significantly improves computational efficiency, reduces energy consumption, and enables real-time response and high-performance inference capabilities.
Technical Principles and Architecture of AI Chips
The core computational principle of AI chips is the artificial neural network: the processing units inside the chip simulate the working mechanism of biological neurons. Each processing unit can independently perform complex mathematical operations, such as multiplying the input signals by weights and accumulating the results to form the neuron's activated output. The activation function, which determines how the accumulated signal is transformed into a meaningful result, is an indispensable part of the AI chip.
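The multiply-accumulate-then-activate mechanism described above can be sketched in a few lines of plain Python (an illustrative software model of one neuron, not actual chip code; the example values are made up):

```python
def relu(x):
    # A common activation function: negative signals are cut off at zero
    return max(0.0, x)

def neuron_forward(inputs, weights, bias):
    # Multiply each input signal by its weight, accumulate, then activate
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(weighted_sum)

x = [0.5, -1.0, 2.0]   # input signals (hypothetical)
w = [0.4, 0.3, 0.6]    # synaptic weights (hypothetical)
b = 0.1                # bias term
y = neuron_forward(x, w, b)
```

An AI chip's processing units perform exactly this multiply-accumulate pattern, but across thousands of neurons in parallel.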
AI chip hardware architectures vary widely and can be categorized as follows based on their design goals and application scenarios:
GPU (Graphics Processing Unit): Originally used primarily for graphics rendering, GPUs have become widely used for training large-scale deep learning models due to their strong parallel computing capabilities, especially adept at handling floating-point intensive computing tasks.
FPGA (Field Programmable Gate Array): FPGA has highly flexible programmability and can be quickly reconfigured at the hardware level to adapt to different AI algorithms, making it suitable for early development stages and dynamic workload scenarios.
ASIC (Application-Specific Integrated Circuit): ASIC is a chip customized for specific AI tasks. Compared to GPUs and FPGAs, it has higher computational efficiency and lower energy consumption in specific applications, but lacks versatility.
TPU (Tensor Processing Unit): Google's TPU is an ASIC instance specifically designed for machine learning tasks, focusing on efficient matrix operations, and is especially suitable for deep learning models under the TensorFlow framework.
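To give a sense of the workload scale these architectures target, here is a rough sketch that counts the multiply-accumulate (MAC) operations in a single convolution layer; the formula is the standard one for dense 2D convolution, and the layer dimensions are made-up examples:

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k):
    # Each output element needs c_in * k * k multiply-accumulates;
    # there are h_out * w_out * c_out output elements in total.
    return h_out * w_out * c_out * c_in * k * k

# Hypothetical layer: a 3x3 convolution producing a 112x112x64
# feature map from a 3-channel input
macs = conv2d_macs(112, 112, 3, 64, 3)  # about 21.7 million MACs
```

A deep network stacks dozens of such layers, which is why throughput is quoted in TOPS (trillions of operations per second) and why dedicated matrix/convolution hardware pays off.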
Classification and Market Applications of AI Chips
AI chips are widely used in various fields, including but not limited to:
1. Autonomous driving: AI chips process data collected by vehicle sensors in real time, enabling accurate navigation and decision-making and improving the safety and reliability of autonomous driving.
2. Intelligent security: AI chips are used in video surveillance, facial recognition, and other security applications to improve the efficiency and accuracy of security monitoring.
3. Smart home: AI chips support the intelligent control and management of smart home devices, enhancing the living experience.
4. Healthcare: AI chips are used in medical image analysis, disease diagnosis, and related fields to assist doctors in providing precise treatment.
Current Status and Future Challenges of AI Chips in China
The domestic AI chip market has developed rapidly in recent years, giving rise to a number of innovative and competitive companies, including well-known ones such as Huawei, Cambricon, Horizon Robotics, and Baidu, as well as international companies like Nvidia. Below is a brief introduction to one chip from each of these companies:
Huawei HiSilicon's Ascend 910
Architecture: based on HiSilicon's Da Vinci architecture
Manufacturing process: 7nm
Number of cores: 32 Da Vinci AI Cores
Performance metrics:
Half-precision (FP16) computing power: up to 256 TeraFLOPS (trillion floating-point operations per second)
Integer precision (INT8) computing power: up to 512 TeraOPS (trillion integer operations per second)
Media processing: integrates a 128-channel full-HD video decoder alongside high-speed memory interfaces
Maximum power consumption: approximately 350 watts
Cambricon Siyuan 370 (MLU370)
Architecture: MLUarch03
Manufacturing process: 7nm
Computing power: up to 256 TOPS (INT8), 64 TFLOPS (FP16)
Transistor count: 39 billion
Memory support: LPDDR5
Application scenario: cloud computing data centers
Maximum power consumption: 250W
Horizon Robotics Journey 5
Architecture: dual-core BPU based on Horizon Robotics' self-developed Bayes architecture, optimized for AI computing
Computing power: up to 128 TOPS per chip, able to handle large numbers of parallel computing tasks
Power consumption: 30W
Process: 16nm
Application scenarios: In-vehicle AI in autonomous driving, smart cockpits, and intelligent monitoring.
Baidu Kunlun 2
Architecture: the Baidu Kunlun 2 chip adopts Baidu's self-developed second-generation XPU architecture, deeply optimized for AI computing; it efficiently executes large-scale parallel computing tasks and is particularly well suited to deep learning and machine learning workloads.
Computing power: INT8 integer precision computing power reaches 256 TeraOPS (trillion integer operations per second).
The half-precision (FP16) computing power is 128 TeraFLOPS (trillion floating-point operations per second).
Power consumption: Maximum 120W
Process technology: 7nm.
Application scenarios: Baidu Kunlun 2 chip is suitable for AI computing needs in multiple scenarios such as cloud, edge, and device.
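The INT8 figures quoted for these chips refer to 8-bit integer arithmetic, which is typically reached by quantizing FP32 model weights. A minimal sketch of symmetric linear quantization, a common general technique (illustrative only, not any particular vendor's scheme; the weight values are made up):

```python
def quantize_int8(values):
    # Symmetric linear quantization: map the largest magnitude to 127
    scale = max(abs(v) for v in values) / 127.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate floating-point values from INT8 codes
    return [q * scale for q in quantized]

weights = [0.82, -0.41, 0.05, -1.27]      # hypothetical FP32 weights
q, scale = quantize_int8(weights)          # q holds values in [-127, 127]
restored = dequantize(q, scale)            # close to the original weights
```

Because an INT8 multiply-accumulate is cheaper in silicon than an FP16 one, chips such as the Ascend 910 quote roughly double the throughput for INT8 (512 TOPS) versus FP16 (256 TFLOPS).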
Nvidia H100
Architecture: Hopper architecture
Computing power (SXM version; the TF32/FP16/BF16/INT8 values are peak Tensor Core throughput with structured sparsity):
FP64 Tensor Core: 67 TFLOPS
TF32 Tensor Core: 989 TFLOPS
FP16 Tensor Core: 1,979 TFLOPS
BF16 Tensor Core: 1,979 TFLOPS
INT8 Tensor Core: 3,958 TOPS
Power consumption: 700W
Process: TSMC 4N (4nm-class)
Application scenarios: Machine learning, deep learning training and inference, scientific computing simulation, data analysis, natural language processing, etc.
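Using only the INT8 throughput and maximum power figures quoted above, one can make a rough energy-efficiency comparison (TOPS per watt). Note that the H100 number is a peak Tensor Core figure with sparsity and the chips target different market segments, so this is an order-of-magnitude sketch, not a rigorous benchmark:

```python
# (INT8 TOPS, max power in watts), as quoted in the sections above
chips = {
    "Ascend 910": (512, 350),
    "MLU370":     (256, 250),
    "Journey 5":  (128, 30),
    "Kunlun 2":   (256, 120),
    "H100":       (3958, 700),
}

# Efficiency in TOPS per watt, sorted from highest to lowest
efficiency = {name: tops / watts for name, (tops, watts) in chips.items()}
for name, eff in sorted(efficiency.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {eff:.2f} TOPS/W")
```

The low-power Journey 5 compares well per watt because it targets in-vehicle deployment, while the H100's raw throughput dwarfs the rest.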
It is evident that while domestic AI chips have achieved certain successes in design and application, a performance gap still exists compared to leading international companies like Nvidia. Domestic AI chips also face a series of key challenges:
1. Technological barriers and core patents: Chinese companies lag behind international leaders in high-end chip design, EDA tools, IP cores, and advanced manufacturing processes. At 7nm and below in particular, they remain highly dependent on foreign technologies and equipment and face the risk of sanctions.
2. Market competition and brand awareness: Although Huawei and other manufacturers have a significant influence in the domestic market, Nvidia, Intel, AMD and other companies still dominate the AI chip field in the international market. It will take time for Chinese companies to build brand influence and customer trust globally.
3. Talent reserves and training: The research and design of high-end AI chips requires a large number of professionals across a wide range of technical fields, including integrated circuit design, algorithm optimization, and materials science. China needs to further strengthen its talent training and recruitment efforts to support the long-term development of the industry.
With the continuous efforts and innovation of domestic enterprises, this gap is expected to gradually narrow in the future. At the same time, the government should increase its support for the AI chip industry to promote its rapid development in China.