The NVIDIA GB300 NVL72 is designed for the era of artificial intelligence.
Overview
Developed for AI Reasoning Capabilities
With its fully liquid-cooled, rack-scale design, the NVIDIA GB300 NVL72 combines 36 Arm-based NVIDIA Grace CPUs and 72 NVIDIA Blackwell Ultra GPUs into a single platform optimized for test-time-scaling inference. Compared with the NVIDIA Hopper platform, AI factories built on the GB300 NVL72 with NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet and ConnectX-8 SuperNICs deliver 50x higher output for reasoning-model inference.
Performance
AI Factories Reaching Unprecedented Performance Levels
Take advantage of the NVIDIA GB300 NVL72 platform’s cutting-edge AI reasoning capabilities. Compared with Hopper, the GB300 NVL72 delivers a 10x increase in per-user responsiveness (tokens per second (TPS) per user) and a 5x increase in throughput per megawatt (TPS per MW). Together, these gains yield a 50x increase in total AI factory output.
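As a quick arithmetic sanity check, the two published gains multiply to the headline figure — a sketch of how the article's 50x claim composes, using NVIDIA's stated multipliers:

```python
# NVIDIA's published GB300 NVL72 gains relative to Hopper.
throughput_per_mw_gain = 5   # 5x TPS per megawatt
per_user_tps_gain = 10       # 10x TPS per user

# The 50x total AI factory output is the product of the two gains.
factory_output_gain = throughput_per_mw_gain * per_user_tps_gain
print(factory_output_gain)  # prints 50
```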
Features
AI Reasoning Inference
Test-time scaling and AI reasoning increase the compute required to reach maximum throughput while maintaining quality of service. Compared with NVIDIA Blackwell GPUs, NVIDIA Blackwell Ultra’s Tensor Cores deliver 1.5x more AI-compute floating-point operations per second (FLOPS) and 2x faster attention-layer acceleration.
288 GB of HBM3e
Larger memory capacity enables bigger batch sizes and maximum throughput performance. Combined with the additional AI compute, NVIDIA Blackwell Ultra GPUs provide 1.5x larger HBM3e memory capacity, boosting AI reasoning throughput at the longest context lengths.
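The 288 GB per-GPU figure scales to the rack-level total given in the spec table — a minimal sketch, assuming decimal terabytes (1 TB = 1,000 GB):

```python
# Rack-level HBM3e capacity from the per-GPU figure in the text.
hbm3e_per_gpu_gb = 288   # GB of HBM3e per Blackwell Ultra GPU
num_gpus = 72            # GPUs per GB300 NVL72 rack

total_hbm_tb = hbm3e_per_gpu_gb * num_gpus / 1000  # decimal TB
print(total_hbm_tb)  # prints 20.736, i.e. roughly the "Up to 21 TB" spec figure
```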
NVIDIA Blackwell Architecture
The NVIDIA Blackwell architecture powers a new age of unmatched speed, efficiency, and scale by delivering ground-breaking advances in accelerated computing.
NVIDIA ConnectX-8 SuperNIC
The NVIDIA ConnectX-8 SuperNIC’s input/output (IO) module houses two ConnectX-8 devices, giving each GPU in the NVIDIA GB300 NVL72 800 gigabits per second (Gb/s) of network bandwidth. This delivers best-in-class remote direct memory access (RDMA) with either the NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet networking platform, enabling peak AI-workload efficiency.
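Summed naively across the rack, the per-GPU figure implies the following aggregate — an illustrative derivation from the article's numbers, not a published spec:

```python
per_gpu_gbps = 800   # network bandwidth per GPU via ConnectX-8
num_gpus = 72        # GPUs per GB300 NVL72 rack

# Assumption: aggregate is a simple per-GPU sum, ignoring topology.
aggregate_tbps = per_gpu_gbps * num_gpus / 1000
print(aggregate_tbps)  # prints 57.6 (Tb/s of total rack network bandwidth)
```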
NVIDIA Grace CPU
The NVIDIA Grace CPU is a breakthrough processor designed for modern data center workloads. It delivers outstanding performance and memory bandwidth with twice the energy efficiency of today’s leading server processors.
Fifth-Generation NVIDIA NVLink
Accelerated computing reaches its full potential only when every GPU can communicate seamlessly with every other. Thanks to the fifth-generation NVIDIA NVLink scale-up interconnect, AI reasoning models achieve faster performance.
NVIDIA GB300 Grace Blackwell Ultra Superchip
The NVIDIA GB300 Grace Blackwell Ultra Superchip, the building block of the GB300 NVL72 rack-scale system, combines four NVIDIA Blackwell Ultra GPUs, two Grace CPUs, and four ConnectX-8 SuperNICs. Eighteen of these superchips, connected with NVIDIA NVLink Switch technology and NVIDIA BlueField-3 DPUs, form one massive GPU built for the era of AI reasoning.
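The superchip counts reconcile with the rack totals — a minimal check of the article's own numbers:

```python
superchips = 18          # superchips per GB300 NVL72 rack
gpus_per_superchip = 4   # Blackwell Ultra GPUs per superchip
cpus_per_superchip = 2   # Grace CPUs per superchip

total_gpus = superchips * gpus_per_superchip
total_cpus = superchips * cpus_per_superchip
print(total_gpus, total_cpus)  # prints 72 36
```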
Release date
NVIDIA Blackwell Ultra products, including the GB300 NVL72, are expected to be available from partners in the second half of 2025.
Pricing
The price range for each NVIDIA GB300 NVL72 AI server is between $3.7 million and $4 million.
Conclusion
The GB300 NVL72 is built to serve AI workloads at major research institutions and hyperscalers. It combines 36 Grace CPUs with 72 Blackwell Ultra GPUs in a rack-scale configuration and can process 1,000 tokens per second on DeepSeek’s R1 model. Designed for AI reasoning, it substantially raises the performance ceiling for AI models.
NVIDIA GB300 NVL72 Specs
| Specification | Details |
|---|---|
| GPU Configuration | 72 NVIDIA Blackwell Ultra GPUs |
| CPU Configuration | 36 NVIDIA Grace CPUs |
| NVLink Bandwidth | 130 TB/s |
| Fast Memory | Up to 40 TB |
| GPU Memory | Up to 21 TB |
| GPU Memory Bandwidth | Up to 576 TB/s |
| CPU Memory | Up to 18 TB SOCAMM with LPDDR5X |
| CPU Memory Bandwidth | Up to 14.3 TB/s |
| CPU Core Count | 2,592 Arm Neoverse V2 cores |
| FP4 Tensor Core | 1,400 PFLOPS |
| FP8/FP6 Tensor Core | 720 PFLOPS |
| INT8 Tensor Core | 23 POPS |
| FP16/BF16 Tensor Core | 360 PFLOPS |
| TF32 Tensor Core | 180 PFLOPS |
| FP32 | 6 PFLOPS |
| FP64 / FP64 Tensor Core | 100 TFLOPS |
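The spec figures are internally consistent; for example, the rack's core count divides evenly across its CPUs — a sketch using the table's numbers:

```python
total_cores = 2592   # Arm Neoverse V2 cores per rack (spec table)
num_cpus = 36        # Grace CPUs per rack

cores_per_cpu = total_cores // num_cpus
print(cores_per_cpu)  # prints 72, the per-CPU core count of NVIDIA Grace
```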