Tuesday, April 1, 2025

NVIDIA GB300 NVL72: 50x Faster AI with Blackwell Ultra GPUs

The NVIDIA GB300 NVL72 is designed for the era of artificial intelligence.

Overview

Developed for AI Reasoning Capabilities

With its fully liquid-cooled, rack-scale design, the NVIDIA GB300 NVL72 combines 36 Arm-based NVIDIA Grace CPUs and 72 NVIDIA Blackwell Ultra GPUs into a single platform optimized for test-time scaling inference. Compared to the NVIDIA Hopper platform, AI factories equipped with the GB300 NVL72 and NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet paired with ConnectX-8 SuperNICs offer 50x higher output for reasoning model inference.

Performance

AI Factories Reaching Unprecedented Performance Levels

Take advantage of the NVIDIA GB300 NVL72 platform's cutting-edge AI reasoning capabilities. Compared with Hopper, the GB300 NVL72 delivers a 5x increase in throughput per megawatt (TPS/MW) and a 10x increase in per-user responsiveness (TPS per user). Together, these gains multiply to a 50x increase in total AI factory output.
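As a quick sanity check, the two multipliers compound multiplicatively into the headline figure; the sketch below uses only the 5x and 10x gains stated in this article:

```python
# The 50x AI factory output claim is the product of two stated gains
# versus Hopper: throughput per megawatt and per-user responsiveness.
throughput_gain_per_mw = 5     # 5x TPS per megawatt
response_gain_per_user = 10    # 10x TPS per user
total_factory_gain = throughput_gain_per_mw * response_gain_per_user
print(f"Total AI factory output gain: {total_factory_gain}x")  # 50x
```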

Features

AI Reasoning Inference

Test-time scaling and AI reasoning increase the computation required to attain maximum throughput and quality of service. Compared to NVIDIA Blackwell GPUs, NVIDIA Blackwell Ultra's Tensor Cores deliver 1.5x more AI compute floating-point operations per second (FLOPS) and 2x the attention-layer acceleration.

288 GB of HBM3e

Larger memory capacity enables higher batch sizes and maximum throughput performance. Combined with the additional AI compute, NVIDIA Blackwell Ultra GPUs provide 1.5x larger HBM3e memory, increasing AI reasoning throughput for the longest context lengths.
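A rough sketch of where the rack-level memory figure comes from; note that the 192 GB per-GPU baseline for standard Blackwell (B200) is an assumption not stated in this article:

```python
# Per-GPU HBM3e: Blackwell Ultra carries 1.5x the capacity of Blackwell.
blackwell_hbm3e_gb = 192                    # assumed B200 baseline
ultra_hbm3e_gb = blackwell_hbm3e_gb * 1.5   # 288 GB, as in the heading
gpus_per_rack = 72
rack_hbm3e_tb = ultra_hbm3e_gb * gpus_per_rack / 1000
print(f"{ultra_hbm3e_gb:.0f} GB per GPU, ~{rack_hbm3e_tb:.1f} TB per rack")
```

The ~20.7 TB result is consistent with the "Up to 21 TB" GPU memory entry in the spec table.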

NVIDIA Blackwell Architecture

The NVIDIA Blackwell architecture powers a new age of unmatched speed, efficiency, and scale by delivering ground-breaking advances in accelerated computing.

NVIDIA ConnectX-8 SuperNIC

The NVIDIA ConnectX-8 SuperNIC's input/output (IO) module houses two ConnectX-8 devices, giving each GPU in the NVIDIA GB300 NVL72 800 gigabits per second (Gb/s) of network bandwidth. This provides best-in-class remote direct-memory access (RDMA) capabilities with either NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet networking, enabling peak AI workload efficiency.
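The per-GPU figure implies a rack-wide aggregate; the total below is derived arithmetic, not a number from this article:

```python
# Each GPU gets 800 Gb/s via ConnectX-8; scale to all 72 GPUs in a rack.
gbps_per_gpu = 800
gpus_per_rack = 72
aggregate_tbps = gbps_per_gpu * gpus_per_rack / 1000
print(f"Aggregate network bandwidth: {aggregate_tbps} Tb/s")  # 57.6 Tb/s
```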

NVIDIA Grace CPU

The NVIDIA Grace CPU is a revolutionary processor designed for today's data center workloads. It offers twice the energy efficiency of today's leading server processors, along with exceptional performance and memory bandwidth.

Fifth-Generation NVIDIA NVLink

For accelerated computing to reach its full potential, all GPUs must communicate with one another seamlessly. Thanks to the fifth-generation NVIDIA NVLink scale-up interconnect, AI reasoning models can achieve faster performance.

NVIDIA GB300 Grace Blackwell Ultra Superchip

The NVIDIA GB300 Grace Blackwell Ultra Superchip, the building block of the GB300 NVL72 rack-scale system, combines four NVIDIA Blackwell Ultra GPUs, two Grace CPUs, and four ConnectX-8 SuperNICs. Connecting 18 superchips with NVIDIA NVLink Switch technology and NVIDIA BlueField-3 DPUs creates one massive GPU built for the era of AI reasoning.
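The superchip counts reconcile with the rack totals in the spec table; the 72-cores-per-Grace figure below is inferred from the table's 2,592-core total rather than stated in the text:

```python
# Reconcile per-superchip counts with the rack-scale totals.
superchips = 18
gpus = superchips * 4    # 4 Blackwell Ultra GPUs each -> 72 GPUs
cpus = superchips * 2    # 2 Grace CPUs each -> 36 CPUs
cores = cpus * 72        # 72 Neoverse V2 cores per Grace (inferred)
print(gpus, cpus, cores)  # 72 36 2592
```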

Release date

Partners are expected to offer NVIDIA Blackwell Ultra products, including the GB300 NVL72, in the second half of 2025.

Pricing

Each NVIDIA GB300 NVL72 AI server is priced between $3.7 million and $4 million.

Conclusion

The GB300 NVL72 is intended to support AI workloads for major research institutions and hyperscalers. Combining 36 Grace CPUs with 72 Blackwell Ultra GPUs in a rack-scale configuration, it can process 1,000 tokens per second running DeepSeek's R1 model. Designed for AI reasoning, it raises the performance ceiling for AI models.

NVIDIA GB300 NVL72 Specs

Specification: Details
GPU Configuration: 72 NVIDIA Blackwell Ultra GPUs
CPU Configuration: 36 NVIDIA Grace CPUs
NVLink Bandwidth: 130 TB/s
Fast Memory: Up to 40 TB
GPU Memory: Up to 21 TB
GPU Memory Bandwidth: Up to 576 TB/s
CPU Memory: Up to 18 TB SOCAMM with LPDDR5X
CPU Memory Bandwidth: Up to 14.3 TB/s
CPU Core Count: 2,592 Arm Neoverse V2 cores
FP4 Tensor Core: 1,400 PFLOPS
FP8/FP6 Tensor Core: 720 PFLOPS
INT8 Tensor Core: 23 POPS
FP16/BF16 Tensor Core: 360 PFLOPS
TF32 Tensor Core: 180 PFLOPS
FP32: 6 PFLOPS
FP64 / FP64 Tensor Core: 100 TFLOPS
Drakshi
Since June 2023, Drakshi has been writing articles on Artificial Intelligence for govindhtech. She holds a postgraduate degree in business administration and is an Artificial Intelligence enthusiast.