
AI-driven new infrastructure is leading a "three-wave revolution," driving industrial upgrading and innovation.

2026-04-06

The new generation of AI infrastructure targets industrial users, adopting a "large center + node" model to build a computing power network covering the entire region. Through coordinated construction and operation, it promotes regional economic integration and intelligent development. Model as a Service (MaaS) is at its core, delivering large-model capabilities as efficient cloud services that shorten the deployment cycle of AI applications, lower the cost for enterprises to use large models, and promote the deep integration of AI with various industries.

I. AI-Driven New Infrastructure Ushers in a "Three-Wave Transformation"

The first wave, arriving in 2023, is the "knowledge productivity revolution": centered on large language models, it transforms the productivity of knowledge work, much as the printing press did in the Middle Ages. Large language models possess ultra-high-speed learning capabilities, significantly improving the speed and accuracy with which knowledge is learned, searched, and disseminated in a human-machine collaborative model.

The second wave is the "software revolution," with intelligent programming assistants such as SenseTime's "Code Raccoon" covering the software development lifecycle and delivering efficiency gains of over 50%. China has the world's second-largest population of programmers, and with natural-language programming, Chinese itself is effectively becoming a programming language. Large language models also support chaining multiple software tools and combining multiple models collaboratively, with applications in AI agents, MoE architectures, and end-to-end intelligent customer service. A new generation of AI-native software applications is becoming widespread, and young people are growing up with emerging AI software and the innovative thinking of the MaaS model.

Large-language-model programming assistants empower software development, improving efficiency and reducing costs

The third wave is the "AI computing revolution." As large language models continue to scale, demand for AI computing power is growing exponentially, challenging regional infrastructure that grows only linearly. Faced with this contradiction, AI computing infrastructure is undergoing technological and engineering innovation to continuously reduce costs and improve efficiency. This is turning AI into general-purpose infrastructure that empowers all industries, while the "battle of a hundred models" is evolving into a specialized division of labor within the AI industry. According to an AI Now report, the computing power demanded by large models doubles every 1-2 months, an exponential growth rate that outstrips traditional architectures. Because this "AI super demand curve" is outpacing the supply of computing power from traditional architectures, short-term market phenomena such as AI chip capacity bottlenecks and price increases have emerged. Through large-scale investment in intelligent infrastructure and technological innovation, core challenges such as large model training costs, GPU supply, and communication bottlenecks are expected to be resolved within the next three years, lowering the overall cost of AI computing and unleashing the general public's potential to innovate with intelligent applications.

Cost pressure of large model computing power

II. Large models and generative AI are driving the arrival of the AI 2.0 era.

The AI 2.0 era is dominated by generative AI, which is no longer limited to pattern detection and rule following. Instead, through large-scale model training it achieves a process similar to human creation, realizing a fundamental shift from "classifier" to "generator." Forecasts show that by 2027, generative AI will account for 42% of global AI spending, reaching $180 billion, with a compound annual growth rate of 169.7%. Large models are the foundation of generative AI development, and more than 300 of them have already been released in the Chinese market. Enterprises increasingly recognize the disruptive potential of generative AI. Gartner predicts that by 2026, over 80% of enterprises will use generative AI APIs or models, or deploy generative-AI-enabled applications in their production environments, up from less than 5% at the beginning of 2023. Generative AI is moving from hype to practical application, with enormous potential for value creation. McKinsey predicts that generative AI could create approximately $7 trillion in value for the global economy, increasing the overall economic benefits of AI by about 50%, with China expected to contribute approximately $2 trillion, nearly one-third of the global total.

Generative AI is driving the large-scale development of the AI market and bringing new economic benefits.

1. Generative AI drives industrial scaling, accelerating the realization of the vision of ubiquitous AI.

Generative AI is growing rapidly, with enterprises moving from sporadic, experimental applications to deploying it across their business processes to strengthen their competitive advantage. According to McKinsey research, one-third of companies report that their organizations regularly use generative AI in at least one business function. Enterprises are increasing investment in generative AI, shifting ICT budgets toward it to capture its benefits. According to IDC research, 24% of Chinese companies have already invested in generative AI, and 69% are screening potential application scenarios or beginning testing and proof-of-concept work. IDC projects that by 2026, 40% of Chinese companies will use generative AI to co-develop digital products and services, achieving twice the revenue growth of their competitors.

Respondents from different regions, industries, and experience levels indicated that they are already using generative AI.

Enterprises are adjusting their AI strategies to accommodate the explosive growth of generative AI, integrating it across their business processes. In the AI 1.0 era, enterprises typically developed long-term plans with fragmented deployments. The generative AI of the AI 2.0 era, however, brings rapid change, so enterprise strategies now emphasize short-term goals, rapid action, and gradual coverage of key business operations. Key strategic changes include rethinking use cases, shifting from predictive analytics and automation toward content generation and creation. As generative AI becomes an indispensable productivity tool, training employees to use these tools responsibly is also a priority.

In the AI 2.0 era, enterprises need to redefine their AI strategies.

By embracing generative AI, businesses can achieve collaborative innovation with their employees. Generative AI extends human expertise, creativity, and knowledge, improving productivity. Most importantly, it expands the space of possibilities for innovation, helping people explore more candidate solutions in less time and extract more value at minimal cost. Gartner predicts that by 2026, more than 100 million people will be working with "robot colleagues" (synthetic virtual colleagues).

2. The AI industry chain is maturing and differentiating, with infrastructure becoming the foundation and guarantee for the development of the AI industry.

Enterprises are actively adopting large models and generative AI, pushing AI applications deeper into industry. Faced with diverse business needs and standards, the AI industry chain is maturing and differentiating rapidly, with more roles and links emerging upstream and downstream, requiring new infrastructure to support this development. Key impacts include:

1) Intelligent computing power has become a core supporting element for the development of the AI industry.

Enterprises are increasingly favoring AI-ready data centers or GPU clusters for large-scale model training to shorten deployment time and reduce long-term investment costs. Intelligent computing power, suitable for large-scale model training, has become a major driver of computing power growth. According to IDC forecasts, China's intelligent computing power is projected to reach 1117.4 EFLOPS by 2027. During the period from 2022 to 2027, China's intelligent computing power is expected to grow at a CAGR of 33.9%, while the general computing power is projected to grow at a CAGR of 16.6% during the same period.

China's Intelligent Computing Power Scale and Forecast, 2020-2027 (FP16 computing, EFLOPS)
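The IDC figures above imply a 2022 baseline that can be back-computed from the CAGR. The sketch below is a simple check; the implied 2022 value is derived from the cited numbers, not a figure stated in the text.

```python
# Back-compute the implied 2022 baseline from IDC's 2027 forecast and CAGR.
target_2027 = 1117.4      # EFLOPS (FP16), from the cited IDC forecast
cagr = 0.339              # 33.9% CAGR over 2022-2027
years = 5

implied_2022 = target_2027 / (1 + cagr) ** years
print(f"Implied 2022 intelligent computing power: {implied_2022:.0f} EFLOPS")  # ~260
```

Running the same formula forward from the implied baseline reproduces the 2027 forecast, which is a quick way to confirm the two cited numbers are mutually consistent.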

2) AI production is shifting to a development paradigm centered on large models.

In the AI 1.0 era, AI application development relied primarily on sophisticated, complex coding to express business logic. As business scenarios evolved from general to fragmented, this model became expensive and struggled with accuracy, hindering the development of the AI industry. In the AI 2.0 era, with base models combined with reinforcement learning from human feedback, AI application development has entered a stage of large-scale deployment. By fine-tuning base models to fit business logic, coupled with prompt engineering, a wider range of business scenarios can be covered more quickly, more cost-effectively, and with higher accuracy, ushering in a new era of rapid, ubiquitous development for the AI industry.

In the AI 2.0 era, the production paradigm of artificial intelligence has undergone a fundamental change.
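In practice, covering many scenarios with one base model often comes down to prompt templates rather than per-scenario code. The sketch below is a minimal, hypothetical illustration of that pattern; the scenario names and templates are invented for this example and do not come from any real product.

```python
# Minimal sketch: one base model covers multiple business scenarios via
# prompt templates instead of hand-written per-scenario logic.
# Scenario names and template wording are illustrative assumptions.

PROMPT_TEMPLATES = {
    "customer_service": "You are a support agent. Answer politely:\n{query}",
    "contract_review":  "List the risky clauses in the contract below:\n{query}",
    "marketing_copy":   "Write a short product blurb for:\n{query}",
}

def build_prompt(scenario: str, query: str) -> str:
    """Render the scenario's template; raise on unknown scenarios."""
    template = PROMPT_TEMPLATES.get(scenario)
    if template is None:
        raise ValueError(f"unknown scenario: {scenario}")
    return template.format(query=query)

print(build_prompt("customer_service", "Where is my order?"))
```

Adding a new business scenario then means adding a template entry (and perhaps fine-tuning data), not writing and maintaining a new model pipeline.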

3) As a new productivity tool, generative AI applications are entering a golden age of development.

With the rapid maturation of base models, generative AI applications have grown explosively. Initially, text and image applications, represented by ChatGPT and Midjourney, rapidly expanded their user bases. Applications such as audio, video, and multimodal generation followed, along with tool applications (such as code generation, Copilot-style assistants, digital humans, marketing tools, and chat assistants) targeting different industries and user groups. In November 2023, OpenAI launched GPTs and announced the GPT Store, enabling users to create customized applications without coding by combining their own instructions, external knowledge, and capabilities. This personalized development model and clear commercialization path have extended leadership in generative AI applications from a few AI vendors to a large number of AI developers.

In the AI 2.0 era, the artificial intelligence industry is ushering in a more prosperous "Age of Exploration".

III. The AI 2.0 era has placed entirely new demands on AI infrastructure.

Entering the AI 2.0 era, traditional CPU-centric cloud computing infrastructure designed for mobile internet applications can no longer meet the challenges brought by the explosion of large model training and generative AI applications. These challenges place entirely new demands on key aspects of AI infrastructure: computing power, algorithm platforms, data, and the engineering systems built around these three.

1. Traditional computing infrastructure cannot meet the new requirements of large models and generative AI.

Large model training and generative AI applications have sharply increased demand for GPUs and heterogeneous computing, which traditional CPU-based computing can no longer meet. This places multifaceted demands on the computing efficiency and stability of GPU clusters: computing power is no longer a matter of simply stacking resources but requires complex systems-engineering optimization. Under enormous investment pressure, balancing system stability against efficiency has become a critical issue.

1) Explosive growth in demand for AI computing power, with GPUs at its core.

Taking OpenAI's GPT-3 as an example, training a model with 175 billion parameters requires approximately 3,640 PFLOP/s-days of computing power, using 1,024 A100 GPUs for 34 days. As parameter counts continue to grow, the demand for training large models keeps rising. Over the past four years, the compound annual growth rate of large model parameter counts has been approximately 400%, corresponding to an increase in AI computing power demand of more than 150,000 times, far outpacing Moore's Law. GPT-4, for example, reportedly has roughly 500 times as many parameters as GPT-3 and required about 20,000 to 30,000 A100 GPUs and about a month of training. Beyond training, the high-concurrency inference of generative AI applications will drive computing demand even higher, potentially far exceeding the requirements of the training phase.

The demand for AI computing power is growing exponentially to meet the needs of large-scale model development and practice.
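The GPT-3 figure above can be sanity-checked with the widely used 6·N·D approximation (training FLOPs ≈ 6 × parameters × training tokens). The ~300B token count and the A100's FP16 peak throughput used below are assumptions, not numbers from the text.

```python
# Sanity-check the cited GPT-3 training-compute figure with the common
# 6*N*D approximation (FLOPs ~= 6 x parameters x training tokens).
# Assumptions: ~300B training tokens; A100 FP16 peak of 312 TFLOPS.

params = 175e9          # GPT-3 parameter count
tokens = 300e9          # assumed training tokens (commonly cited for GPT-3)
total_flops = 6 * params * tokens             # ~3.15e23 FLOPs

pflops_day = total_flops / (1e15 * 86400)     # convert to PFLOP/s-days
print(f"{pflops_day:.0f} PFLOP/s-days")       # ~3646, close to the cited 3,640

# Implied sustained throughput per GPU over 1,024 A100s for 34 days:
per_gpu = total_flops / (1024 * 34 * 86400)   # FLOP/s per GPU
print(f"{per_gpu / 1e12:.0f} TFLOPS per GPU") # ~105, roughly 1/3 of FP16 peak
```

The implied per-GPU utilization of roughly one third of peak is typical for large distributed runs, which is consistent with the cited cluster size and duration.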

2) High performance and high efficiency have become key to computing infrastructure.

To support large model training, distributed training on multi-machine, multi-GPU clusters has become indispensable. A large cluster, however, is not equivalent to high computing power: in distributed training, efficiency loss from network communication and data caching is a common challenge. In models with hundreds of billions to trillions of parameters, communication can account for up to 50% of training time, and poor interconnects degrade training efficiency and restrict further cluster expansion. The cluster therefore needs high-speed interconnects and a highly reliable network. In parallel training, uneven load can cause network congestion that becomes a system bottleneck, delaying information synchronization across dozens or even all GPU nodes.

In addition, large model training typically uses checkpointing to save model parameters so that training can resume after interruptions. With traditional methods, when the parameter count is large, checkpoint writes take a long time and reduce GPU utilization. For the GPT-3 model, at a file-system write speed of 15 GB/s, a single checkpoint takes 2.5 minutes, wasting the corresponding GPU resources. Supporting large model training therefore requires improvements not only at the cluster hardware level but also coordinated optimization at the software level.
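The 2.5-minute checkpoint figure is roughly consistent with the size of a full GPT-3 training state. The byte-per-parameter breakdown below (fp32 master weights plus two Adam optimizer moments) is an illustrative assumption; real layouts vary.

```python
# Estimate checkpoint size and write time for a GPT-3-scale model.
# Assumption: mixed-precision training saves fp32 master weights plus two
# fp32 Adam moments, i.e. roughly 12 bytes per parameter (layouts vary).

params = 175e9
bytes_per_param = 12                    # assumed: fp32 weights + Adam m and v
ckpt_bytes = params * bytes_per_param   # ~2.1 TB of training state

write_speed = 15e9                      # 15 GB/s file-system write, from the text
write_seconds = ckpt_bytes / write_speed
print(f"Checkpoint ~{ckpt_bytes / 1e12:.1f} TB, write ~{write_seconds / 60:.1f} min")
```

The estimate lands at about 2.3 minutes, close to the cited 2.5 minutes, which suggests the bottleneck in the cited scenario really is raw write bandwidth rather than software overhead.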

The stability of large model training tasks decreases as the size of the training cluster increases.

3) Dedicated, large-scale, and long-term training places higher demands on the stability of GPU clusters.

Training large models on massive GPU clusters takes a long time, and if a single node fails, the entire training run is interrupted, with the cause and location of the failure often hard to determine quickly. For example, training Meta's OPT-175B on 300 billion tokens should theoretically take 33 days on 1,000 80 GB A100 GPUs, but in practice it took 90 days, during which 112 failures occurred, primarily hardware failures, resulting in 35 manual restarts and approximately 70 automatic restarts. Node failures not only extend training time but also waste computing resources. Ensuring training stability is therefore crucial, placing higher demands on cluster construction, including the ability to monitor failures in real time, resume training from where it left off, automatically isolate failed nodes, and quickly locate and recover from failures.
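The OPT numbers above can be turned into a rough availability model. The checkpoint-interval and restart-overhead figures below are illustrative assumptions, not values from the text; only the run length and failure count are cited.

```python
# Rough model of time lost to failures during a long training run.
# From the text: ~112 failures over a 90-day run on ~1,000 GPUs.
# Assumptions (illustrative): checkpoints every 4 hours, so a failure loses
# half an interval of work on average, plus ~1 hour of restart overhead.

run_days = 90
failures = 112
ckpt_interval_h = 4.0      # assumed checkpointing cadence
restart_h = 1.0            # assumed detect/restart overhead per failure

mtbf_h = run_days * 24 / failures
lost_h = failures * (ckpt_interval_h / 2 + restart_h)
print(f"MTBF ~{mtbf_h:.1f} h; ~{lost_h / 24:.0f} days lost to failures")
```

Even under these mild assumptions, a failure every ~19 hours costs about two weeks of a three-month run, which is why automated failure isolation and fast resume matter so much at this scale.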

2. Data quality and efficiency determine the path to high-quality development of large-scale models.

The performance and value of large-scale models depend on high-quality data, but data acquisition, cleaning, and labeling face greater challenges, requiring more efficient AI data management processes to meet the new demands of the large-scale model era. Furthermore, large-scale model training and applications may involve user privacy and sensitive data, thus necessitating effective data governance measures to protect privacy and data security.

Building high-performance, value-aligned large models depends on data quality and efficiency. Data from different sources varies greatly in quality and may include duplicate, invalid, false, or sensitive content, all of which directly affect model performance and the value it generates. To ensure data quality and value alignment, raw data must be preprocessed through steps such as cleaning and labeling. Traditional "workshop-style" data processing can no longer meet the demands of the large model era; efficient "intelligent data processing pipelines" are needed to overcome the high cost and low efficiency of traditional methods.
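One elementary building block of such a pipeline is exact deduplication. The sketch below uses content hashing after light normalization; it is a minimal illustration (the function name and corpus are invented for this example), not a production pipeline, which would also need fuzzy dedup, quality filtering, and PII scrubbing.

```python
import hashlib

def dedup_exact(docs):
    """Drop exact duplicates by hashing normalized text (minimal sketch)."""
    seen, unique = set(), []
    for doc in docs:
        # Normalize whitespace and case so trivial variants collide.
        key = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

corpus = [
    "Large models need clean data.",
    "large models need clean data.  ",   # duplicate after normalization
    "MaaS lowers the barrier to AI adoption.",
]
print(dedup_exact(corpus))   # keeps 2 of the 3 documents
```

Hashing keeps memory proportional to the number of unique documents rather than their total size, which is what makes this step feasible at web-corpus scale.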

With the adoption of generative AI, user privacy and data security are becoming increasingly prominent concerns for enterprises. Uploading enterprise code repositories or historical marketing data may expose user privacy and core corporate secrets; inadequate protection could lead to serious data breaches, causing irreversible damage to users and the enterprise. Efficiently governing data, and isolating and protecting uploaded data during large model training and interaction, has become a pressing issue. Data security is a crucial criterion for users when selecting AI software vendors.

3. Large-scale models require entirely new AI platform service models.

Large-scale model applications can help enterprises achieve their business goals more efficiently, but for most companies, developing large-scale models in-house is costly, and the model design, training, and optimization processes require a high level of expertise from developers. MaaS (Model as a Service) represents a new AI cloud service paradigm that uses large-scale models as a core component of AI infrastructure, providing them to developers and enterprises as cloud services for more efficient industrial development. Currently, companies including Microsoft, Huawei, Baidu, and SenseTime have launched MaaS services.

MaaS platforms help enterprises better utilize large model capabilities.

MaaS significantly accelerates the AI application development process and speeds up innovation iteration. The platform packages pre-trained large models with development tools, data management, and other functions, enabling enterprises to use AI capabilities quickly without building large models from scratch. This shortens the time to launch new products, services, and business models, accelerates innovation iteration, and enhances market competitiveness.

Furthermore, MaaS reduces enterprise costs and promotes the deep integration of AI with various industries. In the AI 1.0 era, the application of small models was limited and development costs were high, resulting in an AI penetration rate of only 4% in traditional industries. The large model era, by contrast, adopts a "base large model + fine-tuning" approach, improving scenario applicability. At the same time, the MaaS model lowers the cost and professional barriers to AI development, encouraging enterprises to more actively pursue AI innovation integrated with their business, driving the deep integration of AI with industries and increasing the penetration rate of AI applications across them.

This service model also fosters a large model ecosystem and promotes the large-scale deployment of large model applications. MaaS is mainly provided by vendors with strong technical capabilities and abundant AI expert resources. Through platform openness and open-source community participation, it attracts more enterprises and developers, forming a diversified application development ecosystem that meets the AI needs of a wider range of segmented scenarios and thereby drives large-scale adoption.

IV. Definition, Characteristics, and Value of Next-Generation AI Infrastructure

The AI 2.0 era necessitates a reimagining of infrastructure, with more refined design and restructuring to support the training and inference of large models and the large-scale deployment of generative AI applications. This next-generation AI infrastructure will be centered on large model capabilities, comprehensively integrating computing resources, data services, and cloud services, focusing on maximizing the performance of large models and generative AI applications. Its key elements include data preparation and management, large model training, inference, model capability invocation, and the deployment of generative AI applications. Enterprises can leverage this next-generation AI infrastructure to develop and run generative AI business and customer applications, while simultaneously training and fine-tuning base and industry models.

The next-generation AI infrastructure mainly consists of computing power, MaaS (Model as a Service), and related tools.

In practical applications, vendors provide consulting services related to large-scale model development practices to address the technical challenges users face when training and using large models. Regarding computing infrastructure, they offer comprehensive computing, storage, and other products and services for large-scale model training and inference, characterized by "high computing power, high collaboration, and strong scalability." This includes a computing foundation composed of high-performance heterogeneous clusters, a highly interconnected computing network, high-performance file storage, and large-scale AI computing resources, as well as powerful linear scalability, providing elastic and flexible cloud-native services.

The MaaS platform layer provides a complete service and toolchain system for large-scale model applications, including a basic large-scale model library, a large-scale model production platform, a data management platform, and application development. The MaaS platform layer can provide pre-built basic large-scale models and APIs, one-stop large-scale model development tools and services, AI-native application development tools, and pre-built high-quality datasets and AI data management services to meet users' needs in different business scenarios. This helps reduce customer usage costs and accelerates the rapid deployment of large-scale models across various industries.

1. Key Features of Next-Generation AI Infrastructure

1) The next-generation AI infrastructure is not simply an AI-enabled version of traditional cloud computing; the two have distinct positioning and development paths. Next-generation AI infrastructure is geared primarily toward industrial users, providing an AI foundation for the research and training of ultra-large models and for the incubation of regional industries and applications. It will also radiate outward across industrial regions, driving the intelligent development of regional economies through sustainable operation.

The next generation of AI infrastructure faces business requirements that differ from those of traditional cloud computing.

Intelligent computing centers, built and operated in an integrated manner, fully leverage the benefits of infrastructure. They serve not only as physical carriers of AI infrastructure but also as comprehensive service platforms integrating public computing power services, data sharing, intelligent ecosystem development, and industrial innovation aggregation. The construction of intelligent computing centers should not only focus on planning guided by the industrial ecosystem but also emphasize support for regional industries, scientific research, and other application scenarios. Choosing a reasonable construction and operation model and achieving sustainable operation after completion will help the local area better utilize computing resources, promote the development of the intelligent industrial ecosystem, and cultivate AI talent.

In terms of AI computing power network deployment, a "large center + node" model is adopted to build a large-scale AI computing power network that is complementary and collaboratively scheduled across regions. The "large center" deploys low-cost, large-scale computing power clusters to meet the needs of training and deploying trillion-parameter models; simultaneously, computing power nodes are deployed in regions with a strong industrial base to meet the integrated computing power needs of industrial training and inference. Through the coordinated expansion of node deployment and the large center, cross-regional collaborative scheduling of training and inference computing power is achieved.

2. Next-generation AI infrastructure creates social value

The new generation of AI infrastructure lowers the barriers to developing and applying large-scale models, creating greater social value in areas such as government and enterprise services, industry, and scientific research innovation. Specifically, this includes three aspects:

Next-generation AI infrastructure empowers government, industry, and scientific research innovation.

1) Enhancement of government intelligence

"One-Stop Government Services" injects large-scale model capabilities into government services. By integrating fragmented government applications and using a powerful, unified platform, "one-stop government services" are achieved to enhance local government governance capabilities. This promotes the efficient implementation of various intelligent services benefiting businesses and citizens, making it easier for businesses and citizens to enjoy urban public services. When processing massive amounts of government data, the large-scale government model can quickly identify hot topics and analyze policy implementation, providing support for policy formulation and implementation, thereby improving social governance. Furthermore, a unified public consultation window, through the large-scale government model, can accurately and quickly identify citizens' needs, improving the efficiency of government services.

2) Stimulating industrial innovation

The "large model + MaaS" approach facilitates regional intelligent transformation, leveraging large models to stimulate regional industrial innovation and accelerate the intelligent transformation of traditional industries. In agriculture, for example, remote-sensing large models are combined with agricultural technology to upgrade and promote it. At the same time, AI infrastructure empowers the research and application of industrial large models, enabling industrial AI to be produced at scale.

3) Empowering scientific research

The new paradigm of "AI for Science" is driving scientific progress. Large model technology has brought significant breakthroughs to scientific research. In biological computing, for example, AlphaFold2 covers 98.5% of the human proteome, and the global medium-range weather forecasting model "FengWu" has for the first time achieved effective high-resolution forecasts of core atmospheric variables beyond 10 days. Large models predict and simulate atomic motion, medical images, and astronomical images, accelerating the automation and intelligence of scientific experiments and propelling the "AI for Science" paradigm toward even greater breakthroughs.

