
AMD Unveils AI Infrastructures at Advancing AI 2025

At its 2025 Advancing AI event, AMD revealed its complete, integrated AI platform and an open, scalable rack-scale AI infrastructure built on industry standards.

At the event, AMD and its partners demonstrated an open-standards AI rack infrastructure, including a 128-GPU rack powered by AMD Instinct MI350 Series GPUs, 5th Gen AMD EPYC CPUs, and AMD Pensando Pollara 400 NICs.

AMD also offered a sneak peek at “Helios,” an upcoming rack-scale solution designed for the most demanding AI workloads. Helios will feature AMD Instinct MI400 Series GPUs, AMD EPYC “Venice” CPUs, and AMD Pensando “Vulcano” AI NICs.

Current AMD AI Rack Infrastructure

AMD’s current offerings include 5th Gen AMD EPYC CPUs, AMD Instinct MI350 Series GPUs, and scale-out networking solutions such as the AMD Pensando Pollara AI NIC. These components are integrated into a design compliant with Open Compute Project (OCP) and Ultra Ethernet Consortium (UEC) standards.


These racks are built on OCP-compliant reference designs, supporting interoperability and integration with existing OCP-compliant infrastructure.

AMD’s approach focuses on four principles:

  • Compute Performance: Instinct MI350 Series GPUs deliver strong inference and training performance, with up to 36 TB of HBM3E memory in a 128-GPU rack and generational performance gains over the previous Instinct generation.
  • Enterprise CPUs: 5th Gen EPYC processors provide the x86 host compute, offering compatibility with existing enterprise applications.
  • Advanced Networking: AMD Pensando Pollara NICs are UEC-ready AI NICs that combine programmable transport, congestion control, and packet spraying for high performance in AI clusters.
  • Open Standards and Design: Support for open standards such as those of the Ultra Ethernet Consortium and open designs from the Open Compute Project.

Combining these hardware components into a single rack solution enables AI infrastructure in both liquid-cooled and air-cooled configurations.
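The rack-level memory figure above can be sanity-checked against the per-GPU capacity. The sketch below assumes 288 GB of HBM3E per MI350 Series GPU (AMD’s published per-GPU figure, not stated in this article):

```python
# Sanity check: rack-level HBM capacity from per-GPU memory.
# Assumes 288 GB of HBM3E per MI350 Series GPU (publicly stated
# per-GPU figure; an assumption relative to this article).
gpus_per_rack = 128
hbm3e_per_gpu_gb = 288

rack_hbm_tb = gpus_per_rack * hbm3e_per_gpu_gb / 1024  # binary TB
print(rack_hbm_tb)  # 36.0, matching the "up to 36 TB" rack figure
```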

Next-Generation AMD AI Rack Infrastructure: Helios

Helios is designed to integrate compute engines and software. This platform combines AMD’s silicon, software, and systems expertise to provide an integrated AI rack platform with scale-up and scale-out capabilities for large-scale training and distributed inference.


Helios is built as a ready-to-deploy solution offering the compute density, memory bandwidth, and scale-out bandwidth needed for demanding AI workloads. It is designed for training models, running distributed inference, and fine-tuning enterprise models.

This solution integrates:

  • Next-Gen AMD Instinct MI400 Series GPUs: Expected to offer up to 432 GB of HBM4 memory, 40 petaflops of FP4 performance, and 300 GB/s of scale-out bandwidth per GPU. These GPUs are intended to deliver rack-scale AI performance for training and distributed inference.
  • Open Scale-Up with UALink: The “Helios” platform scales across 72 GPUs using UALink, an open standard that gives customers choice and interoperability in scale-up fabrics. Within Helios, UALink interconnects the GPUs and scale-out NICs and can also be tunneled over Ethernet, allowing the GPUs to communicate as a unified system.
  • 6th Gen AMD EPYC “Venice” CPUs: Built on the “Zen 6” architecture, these CPUs are expected to offer up to 256 cores, generational performance improvements, and 1.6 TB/s of memory bandwidth to help sustain performance across the “Helios” rack.
  • AMD Pensando “Vulcano” AI NICs: The next-generation NIC for AI scale-out is UEC 1.0 compliant and supports both PCIe and UALink interfaces for direct connectivity to CPUs and GPUs. It will also support 800G network throughput and improved scale-out bandwidth per GPU compared to the previous generation. “Vulcano” aims to ease communication bottlenecks in high-density clusters for large-scale AI deployments.
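From the per-GPU figures above, the rack-level aggregates follow directly. A rough back-of-envelope in decimal units (all inputs are AMD’s projected per-GPU numbers; the totals are derived, not quoted):

```python
# Back-of-envelope rack aggregates for "Helios", derived from the
# per-GPU projections above (decimal units throughout).
gpus = 72
hbm4_per_gpu_gb = 432        # up to 432 GB of HBM4 per MI400 Series GPU
fp4_per_gpu_pflops = 40      # 40 petaflops of FP4 per GPU
scaleout_per_gpu_gbps = 300  # 300 GB/s of scale-out bandwidth per GPU

rack_hbm_tb = gpus * hbm4_per_gpu_gb / 1000            # ~31.1 TB of HBM4
rack_fp4_eflops = gpus * fp4_per_gpu_pflops / 1000     # ~2.88 exaflops FP4
rack_scaleout_tbps = gpus * scaleout_per_gpu_gbps / 1000  # ~21.6 TB/s aggregate

print(rack_hbm_tb, rack_fp4_eflops, rack_scaleout_tbps)
```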

Helios represents AMD’s focus on open standards, heterogeneous compute, and developer innovations for AI beyond the data center. More information will be shared in 2026.

Customer Adoption: Oracle Cloud Infrastructure

Oracle will be among the first to use the AMD Instinct MI355X-powered rack-scale solution. Oracle Cloud Infrastructure supports various enterprise workloads with requirements for scalability, reliability, security, and performance. Oracle’s deployment demonstrates the value of AMD solutions for enterprise-level generative and agentic AI applications.

Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure, said that OCI benefits from its collaboration with AMD, and that Oracle will be among the first to offer MI355X rack-scale infrastructure built on AMD EPYC, Instinct, and Pensando products. He noted strong customer adoption of AMD-powered bare metal instances, showing how customers can adopt and scale AI workloads on OCI AI infrastructure. Oracle also uses AMD technology internally and in customer-facing applications, and plans to continue its engagement across multiple AMD product generations.

AMD Optimized AI Solutions

AMD has extended its solution engineering from the node to the rack and cluster level, addressing the scale needed for autonomous agents. AMD’s focus on open standards, a broad partner ecosystem, and investment in full-stack capabilities supports solution integration.

The AMD open rack-scale AI solution aims to provide ready-to-deploy infrastructure. With “Helios,” AMD continues to develop AI infrastructure.
