NVIDIA unveils DGX SuperPOD, powered by high-efficiency Grace Blackwell Superchips
NVIDIA has introduced its latest AI supercomputer, the NVIDIA DGX SuperPOD, powered by NVIDIA GB200 Grace Blackwell Superchips and built to process trillion-parameter models with constant uptime for generative AI training and inference workloads. The DGX SuperPOD features a new, highly efficient, liquid-cooled rack-scale architecture built with NVIDIA DGX GB200 systems. It delivers 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory, and scales further with additional racks.
Each DGX GB200 system features 36 NVIDIA GB200 Superchips, which together comprise 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs (each Superchip pairs one Grace CPU with two Blackwell GPUs), all connected via fifth-generation NVIDIA NVLink. Compared with the NVIDIA H100 Tensor Core GPU, GB200 Superchips deliver up to a 30x performance increase on large language model inference workloads. "NVIDIA DGX AI supercomputers are the factories of the AI industrial revolution," said Jensen Huang, founder and CEO of NVIDIA. "The new DGX SuperPOD… enables every company, industry and country to refine and generate their own AI."
The Grace Blackwell-powered DGX SuperPOD features eight or more DGX GB200 systems and can scale to tens of thousands of GB200 Superchips interconnected via NVIDIA Quantum InfiniBand to support next-generation AI models. To create a massive shared memory space for those models, customers can deploy a configuration that connects all 576 Blackwell GPUs in eight DGX GB200 systems over NVLink.
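As a quick sanity check on these figures, the sketch below derives the per-system and per-SuperPOD GPU counts from the published Superchip composition (one Grace CPU and two Blackwell GPUs per GB200) and backs out the implied FP4 throughput per GPU from the 11.5-exaflop total. The constants come from this article; the derivation is illustrative only.

```python
# Back-of-envelope check of the figures quoted above. The per-Superchip
# composition (1 Grace CPU + 2 Blackwell GPUs) is NVIDIA's published spec;
# everything else is derived from numbers in this article.

SUPERCHIPS_PER_SYSTEM = 36      # per DGX GB200 system
CPUS_PER_SUPERCHIP = 1          # NVIDIA Grace CPU
GPUS_PER_SUPERCHIP = 2          # NVIDIA Blackwell GPUs
SYSTEMS_PER_SUPERPOD = 8        # baseline DGX SuperPOD configuration

gpus_per_system = SUPERCHIPS_PER_SYSTEM * GPUS_PER_SUPERCHIP   # 72
cpus_per_system = SUPERCHIPS_PER_SYSTEM * CPUS_PER_SUPERCHIP   # 36
gpus_per_superpod = SYSTEMS_PER_SUPERPOD * gpus_per_system     # 576

# The quoted 11.5 exaflops at FP4 implies roughly 20 petaflops per GPU:
fp4_exaflops_total = 11.5
petaflops_per_gpu = fp4_exaflops_total * 1000 / gpus_per_superpod

print(f"GPUs per system:     {gpus_per_system}")
print(f"GPUs per SuperPOD:   {gpus_per_superpod}")
print(f"Implied FP4 per GPU: ~{petaflops_per_gpu:.0f} petaflops")
```

The numbers are self-consistent: 8 systems of 72 GPUs give the 576 GPUs quoted above, and 11.5 exaflops across 576 GPUs works out to roughly 20 petaflops of FP4 compute per Blackwell GPU.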
Fourth-generation NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) technology delivers up to 14.4 teraflops of in-network computing, a 4x increase over the previous generation in the next-generation DGX SuperPOD architecture. The new DGX SuperPOD with DGX GB200 systems also features a unified compute fabric: in addition to fifth-generation NVIDIA NVLink, it will support NVIDIA Quantum-X800 InfiniBand networking, providing up to 1,800 gigabytes per second of bandwidth to each GPU in the platform.
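To give a rough sense of scale for the 1,800 GB/s figure, the back-of-envelope sketch below estimates how long a single idealized transfer of a trillion-parameter model's FP4 weights would take at that rate. The model size, precision, and perfect-utilization assumptions are hypothetical; real transfer times depend on topology, collective algorithms, and the software stack.

```python
# Rough illustration of what 1,800 GB/s per GPU means in practice. The
# model size, precision, and perfect-utilization assumption are all
# hypothetical, chosen to match the trillion-parameter scale in this article.

PARAMS = 1e12                 # a trillion-parameter model (article's scale)
BYTES_PER_PARAM = 0.5         # FP4: 4 bits per parameter (assumed precision)
BANDWIDTH_GBPS = 1_800        # per-GPU bandwidth quoted above

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9        # 500 GB of weights
seconds = weights_gb / BANDWIDTH_GBPS              # idealized, zero overhead

print(f"Weights at FP4: {weights_gb:.0f} GB")
print(f"Idealized single transfer at {BANDWIDTH_GBPS} GB/s: {seconds*1000:.0f} ms")
```

Under these idealized assumptions, a full copy of a trillion-parameter FP4 model (about 500 GB) could stream to one GPU in under 300 milliseconds.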
The DGX SuperPOD integrates with high-performance storage from NVIDIA-certified partners to meet the demands of generative AI workloads. Each SuperPOD is built, cabled, and tested in the factory to dramatically speed up deployment at customer data centres. Intelligent predictive-management capabilities continuously monitor thousands of data points across hardware and software; the system can identify areas of concern, schedule maintenance, flexibly adjust compute resources, and automatically save and resume jobs to prevent downtime, even when no system administrators are present.
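The save-and-resume capability described above follows a familiar checkpointing pattern, sketched minimally below. This is a generic illustration, not NVIDIA's management software: the checkpoint path, cadence, and step counts are all hypothetical.

```python
# A minimal sketch of the save-and-resume pattern described above.
# The checkpoint path and the training-loop placeholder are hypothetical.

import os
import pickle

CHECKPOINT = "/scratch/job_state.pkl"   # hypothetical checkpoint location

def load_state():
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0}

def save_state(state):
    """Write the checkpoint atomically so a crash never leaves it corrupt."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)         # atomic rename on POSIX filesystems

state = load_state()                    # picks up where the last run stopped
while state["step"] < 10_000:
    # ... one unit of training work would run here ...
    state["step"] += 1
    if state["step"] % 1_000 == 0:      # checkpoint at a fixed cadence
        save_state(state)
```

The key design choice is the atomic write: by dumping to a temporary file and renaming it over the old checkpoint, an interrupted job can always resume from a consistent state rather than a half-written one.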
NVIDIA also unveiled the DGX B200 system, a unified AI supercomputing platform for model training, fine-tuning, and inference, further extending its AI supercomputing lineup for industries. Customers can scale up with DGX B200 systems to build DGX SuperPOD deployments, creating AI Centers of Excellence that power large teams of developers running many different jobs. Each system is equipped with eight NVIDIA B200 Tensor Core GPUs and two 5th Gen Intel Xeon processors, and includes advanced networking via the NVIDIA Quantum-2 InfiniBand and NVIDIA Spectrum-X Ethernet platforms for fast AI performance.
NVIDIA DGX platforms include NVIDIA AI Enterprise software for enterprise-grade development and deployment. DGX customers can expedite their work with the pretrained NVIDIA foundation models, frameworks, toolkits, and new NVIDIA NIM microservices bundled in the software platform. Once systems are operational, DGX experts and select NVIDIA partners assist customers in optimizing their AI pipelines and infrastructure. NVIDIA expects DGX SuperPOD with DGX GB200 and DGX B200 systems to be available later this year through its global partners.
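For a concrete sense of how a deployed NIM microservice is consumed, the sketch below posts a chat request to an OpenAI-API-compatible endpoint, which is how NIM services are typically exposed. The URL, model name, and prompt are placeholders for a locally deployed service, not values from this announcement.

```python
# A minimal sketch of calling a NIM microservice over its
# OpenAI-API-compatible HTTP interface. The endpoint URL and model id
# below are hypothetical placeholders for a locally deployed service.

import requests

NIM_URL = "http://localhost:8000/v1/chat/completions"   # hypothetical host
payload = {
    "model": "meta/llama3-8b-instruct",                  # example model id
    "messages": [
        {"role": "user", "content": "Summarize NVLink in one sentence."}
    ],
    "max_tokens": 64,
}

resp = requests.post(NIM_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```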