When it comes to artificial intelligence, there are many new demands on memory, in particular:
Larger capacities – Model sizes are enormous and growing rapidly, potentially reaching tens of terabytes. This scale of data necessitates ever-increasing DDR main memory capacity.
More bandwidth – Because large amounts of data need to be moved, every DRAM type is racing to increase data rates and deliver more memory bandwidth.
Lower latency – Another aspect of the need for speed is lower latency so that processor cores don’t sit idle while waiting for data.
Lower power consumption – We are pushing the limits of physics, and power consumption has become a significant limiting factor in artificial intelligence systems. The demand for higher data rates is also driving up power consumption. To mitigate this, I/O voltages are being reduced, but this reduces voltage margins and increases the chance of errors, necessitating higher reliability as well.
Greater reliability – To address the increasing error rates at higher speeds, lower voltages, and smaller processes, we are seeing more and more use of on-chip ECC and advanced signaling technologies for compensation.
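The capacity figure above can be sanity-checked with back-of-envelope arithmetic (the parameter count and precision below are illustrative assumptions, not figures from the text):

```python
# Rough sizing sketch: a hypothetical 10-trillion-parameter model stored at
# 2 bytes per parameter (e.g., FP16/BF16 weights) lands in the tens of terabytes.
params = 10 * 10**12           # 10 trillion parameters (illustrative assumption)
bytes_per_param = 2            # FP16/BF16 weight precision
total_bytes = params * bytes_per_param
print(total_bytes / 10**12)    # → 20.0 (decimal terabytes)
```

And this counts only the weights; activations, optimizer state, and KV caches push the total higher still.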
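On-chip ECC is typically built on single-error-correcting codes. As a minimal sketch of the underlying idea, the classic Hamming(7,4) code below protects 4 data bits with 3 parity bits and corrects any single-bit error; real DRAM ECC operates on much wider words, so this toy code is illustrative only:

```python
# Hamming(7,4): 4 data bits + 3 parity bits; corrects any single-bit error.
# Codeword positions (1-based): 1=p1, 2=p2, 3=d1, 4=p3, 5=d2, 6=d3, 7=d4.

def encode(data):
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4          # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4          # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(codeword):
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the flipped bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1          # correct the single-bit error in place
    return [c[2], c[4], c[5], c[6]]   # extract the data bits
```

The syndrome computation is exactly why lower voltage margins raise the stakes: each extra random bit flip beyond what the code can correct becomes a silent or detected-but-uncorrectable error, which is why stronger on-die ECC accompanies each process and voltage step.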
Another important topic is the challenges and opportunities of new memory technologies in artificial intelligence. These new technologies offer many potential advantages, including:
Optimization for key use cases – Capacity, bandwidth, latency, and power consumption can be tuned for a specific set of use cases. Artificial intelligence is a huge, important, and well-funded market, making it an ideal driver for new memory technologies. In the past, GDDR (developed for the graphics market), LPDDR (developed for the mobile market), and HBM (developed for high-bandwidth applications such as AI) were all created to meet the needs of use cases that existing memories could not satisfy.
CXL – CXL offers the opportunity to significantly expand memory capacity and bandwidth while abstracting the memory type away from the processor. In this way, CXL provides a good interface for integrating new memory technologies: the CXL memory controller acts as a translation layer between processor and memory, allowing new memory tiers to be inserted into the hierarchy behind locally attached memory.
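The translation-layer idea can be illustrated with a deliberately simplified sketch. This is not the real CXL protocol; the class and method names are invented for illustration. The host sees one flat address space, and accesses beyond locally attached memory are routed to a device-side controller that maps them onto whatever media backs the device:

```python
# Conceptual sketch only (assumed names, not a real CXL implementation):
# the host addresses one flat space; the device controller hides the media type.

class CxlMemoryDevice:
    """Device-side controller: could be backed by DRAM or a new memory type."""
    def __init__(self, media_size):
        self.media = bytearray(media_size)

    def translate(self, host_addr, base):
        # Trivial linear remap; real controllers remap, interleave, and tier.
        return host_addr - base

class Host:
    def __init__(self, local_size, cxl_device):
        self.local = bytearray(local_size)   # locally attached memory
        self.cxl = cxl_device
        self.cxl_base = local_size           # device memory appears after local

    def write(self, addr, value):
        if addr < self.cxl_base:
            self.local[addr] = value
        else:
            self.cxl.media[self.cxl.translate(addr, self.cxl_base)] = value

    def read(self, addr):
        if addr < self.cxl_base:
            return self.local[addr]
        return self.cxl.media[self.cxl.translate(addr, self.cxl_base)]

host = Host(1024, CxlMemoryDevice(4096))
host.write(2000, 0x5A)        # lands on the device, at media offset 976
```

Because the processor only ever issues loads and stores into the flat space, the media behind `CxlMemoryDevice` can change without the processor noticing, which is the abstraction the paragraph above describes.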
While new memory types tailored to specific use cases are beneficial to many applications, they face additional challenges:
DRAM, on-chip SRAM, and flash memory will continue to exist for the foreseeable future, so don't expect anything to completely replace them. The annual R&D and capital expenditures on these technologies, coupled with decades of high-volume manufacturing experience, make it virtually impossible to displace any of them in the short term. To be adopted, any new memory technology must work well alongside these existing ones.
The scale of AI deployments and the risks inherent in developing new memory technologies make adopting an entirely new memory difficult. Memory development timelines typically run 2-3 years, but AI is evolving so rapidly that it is hard to predict which specific capabilities will be needed in the future. The risks are high, as is the dependence on a new technology actually being enabled and made available.
The performance advantage of any new technology must be high enough to offset any additional costs and risks. Given the demands on infrastructure engineering and deployment teams, this means that new memory technologies need to overcome a very high hurdle.
Memory will continue to be a key driver of future AI systems. Our industry must keep innovating so that future systems can deliver faster, more meaningful AI, and it is responding.