GB200 NVL72: Tool for Next-Gen AI and Scientific Discovery

June 3, 2024


NVIDIA GB200 NVL72

The GB200 NVL72 connects 36 Grace CPUs and 72 Blackwell GPUs in a rack-scale design. It is a liquid-cooled, rack-scale solution whose 72-GPU NVLink domain acts as a single massive GPU and delivers 30X faster real-time inference for trillion-parameter LLMs.

A crucial part of the NVIDIA GB200 NVL72 is the GB200 Grace Blackwell Superchip, which uses the NVIDIA NVLink-C2C interconnect to link two powerful NVIDIA Blackwell Tensor Core GPUs and an NVIDIA Grace CPU.

Real-Time LLM Inference

The GB200 NVL72 introduces a second-generation Transformer Engine that enables FP4 AI. Coupled with fifth-generation NVIDIA NVLink, it delivers 30X faster real-time LLM inference for trillion-parameter language models. This advance is made possible by a new generation of Tensor Cores, which introduce new microscaling formats that offer high accuracy and greater throughput. In addition, the GB200 NVL72 uses NVLink and liquid cooling to build a single massive 72-GPU rack that overcomes communication bottlenecks.
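To see why lower-precision formats matter for trillion-parameter models, the following rough sketch estimates how much memory the weights alone would occupy at different precisions. It is only a back-of-the-envelope calculation under stated assumptions: it counts weights only (no KV cache, activations, or the per-block scale factors that microscaling formats carry), it treats 1 TB as 10^12 bytes, and the 13.5 TB figure is the rack-level HBM3e capacity quoted in the specs table of this article. The function name is illustrative.

```python
# Back-of-the-envelope weight footprint for a dense model at several precisions.
# Assumptions: weights only (no KV cache, activations, or scale factors),
# and 1 TB = 1e12 bytes.

BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}
RACK_HBM_TB = 13.5  # "Up to 13.5 TB HBM3e" from the article's specs table


def weight_footprint_tb(num_params: float, precision: str) -> float:
    """Return the weight storage in terabytes for the given precision."""
    bytes_total = num_params * BITS_PER_PARAM[precision] / 8
    return bytes_total / 1e12


if __name__ == "__main__":
    params = 1e12  # a one-trillion-parameter model
    for precision in BITS_PER_PARAM:
        tb = weight_footprint_tb(params, precision)
        share = tb / RACK_HBM_TB
        print(f"{precision}: {tb:.1f} TB of weights ({share:.0%} of rack HBM)")
```

Under these assumptions, halving the bits per parameter halves the memory the weights need and the bytes that must be moved per token, which is one reason FP4 is attractive for inference.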

Massive-Scale Training

The GB200 NVL72 includes a faster second-generation Transformer Engine with FP8 precision, enabling 4X faster training of large language models at scale. This is complemented by fifth-generation NVLink, which provides 1.8 terabytes per second (TB/s) of GPU-to-GPU interconnect, along with InfiniBand networking and NVIDIA Magnum IO software.

Energy-Efficient Infrastructure

Liquid-cooled GB200 NVL72 racks reduce a data centre's energy consumption and carbon footprint. Liquid cooling increases compute density, reduces the floor space used, and enables high-bandwidth, low-latency GPU communication across large NVLink domain architectures. Compared with NVIDIA H100 air-cooled infrastructure, GB200 delivers 25X more performance at the same power while using less water.

Data Processing

Databases play a critical role in handling, processing, and analysing large volumes of enterprise data. GB200 uses the high-bandwidth memory performance of the NVIDIA Blackwell architecture, NVLink-C2C, and dedicated decompression engines to speed up key database queries by 18X compared to CPU, with a 5X better total cost of ownership.

Although large models bring many benefits, training and deploying them can be computationally expensive and resource-intensive. Widespread deployment will depend on systems built for real-time inference that are efficient in compute, cost, and energy. The new NVIDIA GB200 NVL72 is one system designed for that task.

Take Mixture of Experts (MoE) models as an example. These models distribute the computational load across multiple experts and, by using pipeline and model parallelism, can be trained across thousands of GPUs, which improves system efficiency.

Running such models at this scale remains a hard technical problem, but a new level of parallel processing, high-speed memory, and high-performance interconnects can make it tractable for GPU clusters. The NVIDIA GB200 NVL72 rack-scale architecture is designed to do this, and NVIDIA describes it in more detail in the post that follows.
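The Mixture of Experts idea mentioned above can be illustrated with a small, self-contained sketch. The article does not say how experts are selected; top-k gating with a softmax router is assumed here because it is a common choice, and the toy "experts", the function names, and the value k=2 are all hypothetical. In a real deployment each expert would be a neural network, possibly on a different GPU, which is where a fast interconnect becomes important.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(token, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    routes = route_top_k(router_logits, k)
    return sum(weight * experts[i](token) for i, weight in routes)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical "experts": each one just scales its input differently.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    logits = [random.gauss(0.0, 1.0) for _ in experts]
    print("selected experts and weights:", route_top_k(logits))
    print("combined output:", moe_forward(1.0, experts, logits))
```

Only k of the experts run for each token, so the compute per token stays roughly constant as the number of experts grows, while tokens routed to experts on other GPUs generate the cross-GPU traffic that the interconnect has to absorb.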

NVIDIA GB200 NVL36 and NVL72

GB200 supports NVLink domains of 36 and 72 GPUs. Each rack hosts 18 compute nodes based on the MGX reference design and the NVLink Switch System. The GB200 NVL36 configuration has 36 GPUs in one rack with 18 single-GB200 compute nodes. The GB200 NVL72 is configured either with 72 GPUs in one rack with 18 dual-GB200 compute nodes, or with 72 GPUs across two racks with 18 single-GB200 compute nodes each.
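The GPU counts for these configurations follow from simple arithmetic, sketched below. The sketch assumes, as stated elsewhere in this article, that one GB200 Superchip pairs one Grace CPU with two Blackwell GPUs; the class and field names are illustrative and do not correspond to any NVIDIA tool.

```python
from dataclasses import dataclass

GPUS_PER_SUPERCHIP = 2  # one Grace CPU paired with two Blackwell GPUs
NODES_PER_RACK = 18     # compute nodes per rack, as described in the text


@dataclass(frozen=True)
class RackLayout:
    name: str
    racks: int
    superchips_per_node: int  # 1 = "single" GB200 node, 2 = "dual" GB200 node

    @property
    def superchips(self) -> int:
        return self.racks * NODES_PER_RACK * self.superchips_per_node

    @property
    def gpus(self) -> int:
        return self.superchips * GPUS_PER_SUPERCHIP

    @property
    def grace_cpus(self) -> int:
        return self.superchips  # one Grace CPU per superchip


LAYOUTS = [
    RackLayout("NVL36, one rack, single-GB200 nodes", racks=1, superchips_per_node=1),
    RackLayout("NVL72, one rack, dual-GB200 nodes", racks=1, superchips_per_node=2),
    RackLayout("NVL72, two racks, single-GB200 nodes", racks=2, superchips_per_node=1),
]

if __name__ == "__main__":
    for layout in LAYOUTS:
        print(f"{layout.name}: {layout.gpus} GPUs, {layout.grace_cpus} Grace CPUs")
```

Both NVL72 layouts end up with the same 36 superchips and 72 GPUs in one NVLink domain; they differ only in how densely the nodes are packed into racks.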

The GB200 NVL72 uses a copper cable cartridge to densely pack and interconnect the GPUs, which simplifies operation. It also uses a liquid-cooled system design, for 25X lower cost and energy consumption.

NVIDIA GB200 NVL72 Features

Blackwell Architecture

The NVIDIA Blackwell architecture brings groundbreaking advances to accelerated computing, opening a new era of computing with unmatched performance, efficiency, and scale.

NVIDIA Grace CPU

The NVIDIA Grace CPU is a processor designed for modern data centres running AI, cloud, and HPC applications. It delivers outstanding performance and memory bandwidth with twice the energy efficiency of today's leading server processors.

Fifth-Generation NVIDIA NVLink

Unlocking the full potential of exascale computing and trillion-parameter AI models requires fast, seamless communication between every GPU in a server cluster. Fifth-generation NVLink is a scale-up interconnect that unlocks accelerated performance for trillion- and multi-trillion-parameter AI models.

NVIDIA Networking

The data centre network is critical to AI development and performance, serving as the backbone for distributed AI model training and generative AI performance. NVIDIA Quantum-X800 InfiniBand, NVIDIA Spectrum-X800 Ethernet, and NVIDIA BlueField-3 DPUs enable efficient scaling across hundreds and thousands of Blackwell GPUs for optimal application performance.

NVIDIA GB200 NVL72 Price

As a high-end system aimed at research institutions and large enterprises, the NVIDIA DGX GB200 NVL72 is priced accordingly. One estimate puts a fully configured system with 72 Blackwell GPUs (36 GB200 Superchips) at roughly $3 million USD.

Here’s why it costs so much:

Massive processing power: with 72 GPUs, it delivers about 1.44 exaFLOPS of FP4 performance.

Advanced hardware: it includes a liquid cooling system, a dedicated NVLink switch system for high-speed interconnect, and 13.5 TB of HBM3e memory.

Because the DGX GB200 NVL72 serves a specialised market, a publicly listed price is hard to find. The $3 million figure should be treated as a rough estimate.

NVIDIA GB200 NVL72 Specs

Specification	GB200 NVL72	GB200 Grace Blackwell Superchip
Configuration	36 Grace CPUs : 72 Blackwell GPUs	1 Grace CPU : 2 Blackwell GPUs
FP4 Tensor Core	1,440 PFLOPS	40 PFLOPS
FP8/FP6 Tensor Core	720 PFLOPS	20 PFLOPS
INT8 Tensor Core	720 POPS	20 POPS
FP16/BF16 Tensor Core	360 PFLOPS	10 PFLOPS
TF32 Tensor Core	180 PFLOPS	5 PFLOPS
FP32	6,480 TFLOPS	180 TFLOPS
FP64	3,240 TFLOPS	90 TFLOPS
FP64 Tensor Core	3,240 TFLOPS	90 TFLOPS
GPU Memory | Bandwidth	Up to 13.5 TB HBM3e | 576 TB/s	Up to 384 GB HBM3e | 16 TB/s
NVLink Bandwidth	130 TB/s	3.6 TB/s
CPU Core Count	2,592 Arm Neoverse V2 cores	72 Arm Neoverse V2 cores
CPU Memory | Bandwidth	Up to 17 TB LPDDR5X | Up to 18.4 TB/s	Up to 480 GB LPDDR5X | Up to 512 GB/s
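The rack-level column can be cross-checked against the superchip column, since an NVL72 contains 36 superchips. The short sketch below is only a sanity check on the published figures, not an NVIDIA tool; the row selection and the 1% tolerance (to allow for rounding such as 129.6 TB/s being quoted as 130 TB/s) are assumptions made for this illustration.

```python
SUPERCHIPS_PER_RACK = 36  # 36 Grace CPUs : 72 Blackwell GPUs

# (metric, per-superchip value, rack value, unit), copied from the specs table.
ROWS = [
    ("FP4 Tensor Core", 40, 1440, "PFLOPS"),
    ("FP8/FP6 Tensor Core", 20, 720, "PFLOPS"),
    ("FP16/BF16 Tensor Core", 10, 360, "PFLOPS"),
    ("TF32 Tensor Core", 5, 180, "PFLOPS"),
    ("FP32", 180, 6480, "TFLOPS"),
    ("FP64", 90, 3240, "TFLOPS"),
    ("GPU memory bandwidth", 16, 576, "TB/s"),
    ("NVLink bandwidth", 3.6, 130, "TB/s"),
    ("CPU cores", 72, 2592, "cores"),
]


def scaled(per_superchip: float) -> float:
    """Scale a per-superchip figure up to a full 36-superchip rack."""
    return per_superchip * SUPERCHIPS_PER_RACK


def relative_gap(per_superchip: float, rack: float) -> float:
    """Relative difference between the scaled figure and the quoted rack figure."""
    return abs(scaled(per_superchip) - rack) / rack


if __name__ == "__main__":
    for metric, chip, rack, unit in ROWS:
        status = "ok" if relative_gap(chip, rack) < 0.01 else "CHECK"
        print(f"{metric}: 36 x {chip} = {scaled(chip):g} {unit} (table: {rack}) {status}")
    # GPU memory matches if the table counts 1 TB as 1024 GB.
    print("GPU memory:", scaled(384) / 1024, "TB")
```

The memory rows are the only ones that depend on a unit convention: 36 x 384 GB is 13,824 GB, which equals the quoted 13.5 TB only when 1 TB is taken as 1,024 GB.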
