AMD has unveiled a series of accelerators and networking solutions designed to power the next generation of artificial intelligence (AI) infrastructure. Leading the charge is the new AMD Instinct MI325X accelerator, which promises to set a new industry standard for AI performance, particularly for data center and generative AI workloads.
These innovations are bolstered by AMD’s partnerships with leading tech companies such as ASUS, Dell Technologies, HPE, Lenovo, and Supermicro, which will integrate the accelerators into their AI solutions.
Enhanced AI Performance with AMD Instinct MI325X
The AMD Instinct MI325X accelerator, built on the AMD CDNA 3 architecture, delivers leading performance for AI workloads, including training, fine-tuning, and inference of foundation models. It offers 256 GB of HBM3E memory with 6.0 TB/s of memory bandwidth, giving it 1.8 times the capacity and 1.3 times the bandwidth of Nvidia’s H200. This translates into up to 1.4 times the inference performance on specific large language models such as Mistral 7B and Llama 3.1.
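As a quick sanity check of those ratios, here is a minimal arithmetic sketch; the H200 figures used below (141 GB of HBM3E, 4.8 TB/s) are assumptions based on Nvidia’s published specifications and are not part of AMD’s announcement:

```python
# Arithmetic check of the MI325X-vs-H200 ratios quoted above.
# MI325X figures come from AMD's announcement; H200 figures are
# assumed from Nvidia's published specs (141 GB HBM3E, 4.8 TB/s).

mi325x_capacity_gb = 256
mi325x_bandwidth_tbs = 6.0

h200_capacity_gb = 141       # assumed H200 spec
h200_bandwidth_tbs = 4.8     # assumed H200 spec

print(f"capacity ratio:  {mi325x_capacity_gb / h200_capacity_gb:.2f}x")
print(f"bandwidth ratio: {mi325x_bandwidth_tbs / h200_bandwidth_tbs:.2f}x")
```

The capacity ratio works out to roughly 1.82x and the bandwidth ratio to 1.25x, which AMD rounds to the 1.8x and 1.3x figures in its comparison.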
The MI325X accelerators are expected to hit production in the fourth quarter of 2024, with broader system availability beginning in early 2025 through a variety of platform providers.
AMD’s Next-Gen AI Networking
In addition to the MI325X accelerators, AMD is pushing AI networking boundaries with two new products: the AMD Pensando Salina DPU and the AMD Pensando Pollara 400 NIC. These solutions target the distinct challenges of managing front-end and back-end AI network operations.
The Salina DPU is a third-generation data processing unit that promises twice the performance of its predecessor, delivering 400G throughput for front-end AI network traffic. Meanwhile, the Pollara 400, the industry’s first Ultra Ethernet Consortium (UEC) ready AI NIC, accelerates accelerator-to-accelerator communication in the back-end network, improving both performance and scalability.
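The accelerator-to-accelerator traffic the Pollara 400 targets is exactly what collective operations such as all-reduce generate during distributed training. The sketch below uses PyTorch’s distributed API to illustrate that pattern; on ROCm builds of PyTorch the "nccl" backend is backed by AMD’s RCCL library, and the NIC and fabric underneath are transparent to this code:

```python
# Minimal all-reduce sketch: the collective traffic pattern that
# back-end AI NICs such as the Pollara 400 are built to carry.
# Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_sketch.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # maps to RCCL on ROCm
    rank = dist.get_rank()
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; all-reduce sums them across all
    # GPUs, generating the inter-accelerator flows described above.
    t = torch.ones(1024, device="cuda") * rank
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    print(f"rank {rank}: sum element = {t[0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Run across multiple GPUs, every all-reduce call here produces the kind of east-west traffic that a UEC-ready back-end NIC is designed to move at scale.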
AMD’s AI Software: ROCm and Generative AI Capabilities
AMD continues to refine its ROCm open software stack, ensuring its hardware integrates seamlessly with popular AI frameworks like PyTorch and Hugging Face. The latest ROCm 6.2 release offers key features such as FP8 datatype support and Flash Attention 3, enabling up to 2.4x performance gains for inference and 1.8x improvements in training on various large language models (LLMs).
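Because ROCm slots in beneath PyTorch’s existing device API, standard Hugging Face code runs unchanged on Instinct hardware; AMD GPUs simply appear as the "cuda" device. A minimal sketch follows (the model choice is illustrative, reusing the Mistral 7B model mentioned above; a non-None torch.version.hip is how a ROCm build identifies itself):

```python
# Sketch: running a Hugging Face model on an Instinct GPU via ROCm.
# ROCm builds of PyTorch expose AMD GPUs through the standard "cuda"
# device, so this code is identical to the Nvidia path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

print(torch.version.hip)             # non-None on a ROCm build
print(torch.cuda.get_device_name())  # reports the Instinct accelerator

model_id = "mistralai/Mistral-7B-v0.1"  # illustrative; any causal LM works
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

inputs = tok("AI accelerators are", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```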
These advancements are critical as AMD positions itself to compete in the high-performance AI infrastructure space, emphasizing both hardware and software innovations. With the introduction of the AMD Instinct MI350 series, slated for 2025, AMD promises even greater memory capacity and a 35x improvement in inference performance compared to its CDNA 3-based predecessors.