Saturday, July 27, 2024

Gigabyte G593-SD0: HPC Workloads with NVIDIA HGX H100

G593-SD0 Product Overview

Leading Efficiency and Accelerating Artificial Intelligence

Intel has achieved a remarkable leap in CPU performance by building more functions into a ground-breaking platform that helps transform businesses. The AI acceleration engines built into 4th and 5th generation Intel Xeon Scalable processors improve AI and deep learning performance, while additional specialised accelerators handle networking, storage, and analytics. The new Intel Xeon processor families also offer better CPU performance and performance per watt, along with a host of new features that target a wide range of workloads.

These processors run on a PCIe 5.0 platform with twice the throughput of the previous generation, which significantly speeds up data movement to and from GPUs and storage. For memory-bound HPC and AI tasks, Intel has developed the Intel Xeon CPU Max Series with High Bandwidth Memory (HBM). GIGABYTE is ready for this platform with products designed to get the most out of Intel Xeon-based systems that support fast PCIe Gen5 accelerators, Gen5 NVMe SSDs, and high-performance DDR5 memory.

Redefine performance with 4th and 5th generation Intel Xeon Scalable processors and the Intel Xeon CPU Max Series.

Integrated Accelerators

Boost ROI while enhancing effectiveness and output

64 cores maximum

More dedicated cores complete more tasks faster and in parallel.

DDR5 compatibility

With DDR5 RDIMM support, memory frequency and throughput are increased.

64GB HBM

High-bandwidth in-package memory significantly accelerates data-intensive applications.

PCIe 5.0 lanes

Improved I/O throughput with support for the newest SSDs and accelerators, reaching roughly 128GB/s of bidirectional bandwidth on an x16 link

Support for CXL 1.1

Boost efficiency with direct accelerator memory access and memory coherency
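The 128GB/s figure quoted for PCIe 5.0 can be checked with a little arithmetic. The sketch below uses the published PCIe signalling rates (16 GT/s per lane for Gen4, 32 GT/s for Gen5) and 128b/130b encoding; the x16 link width and the function name are assumptions made for illustration, not values taken from the G593-SD0 datasheet.

```python
# Back-of-the-envelope PCIe link bandwidth.
# Signalling rates and 128b/130b encoding are the published PCIe figures;
# the x16 link width is an assumption made for this illustration.

def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int,
                        payload_bits: int = 128, total_bits: int = 130) -> float:
    """Return usable bandwidth in GB/s for ONE direction of a PCIe link."""
    encoding_efficiency = payload_bits / total_bits   # 128b/130b line code
    gbit_per_s = gt_per_s * lanes * encoding_efficiency
    return gbit_per_s / 8                             # bits -> bytes

gen4_one_way = pcie_bandwidth_gb_s(16.0, 16)   # PCIe 4.0 x16
gen5_one_way = pcie_bandwidth_gb_s(32.0, 16)   # PCIe 5.0 x16
gen5_both_ways = 2 * gen5_one_way              # full duplex: both directions

print(f"PCIe 4.0 x16, one direction:   {gen4_one_way:.1f} GB/s")
print(f"PCIe 5.0 x16, one direction:   {gen5_one_way:.1f} GB/s")
print(f"PCIe 5.0 x16, both directions: {gen5_both_ways:.1f} GB/s")
```

The Gen5 result lands at about 63 GB/s in each direction, or about 126 GB/s counting both directions, which is commonly rounded to 128GB/s and is exactly double the Gen4 value.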

Why Choose GIGABYTE Servers for Liquid Cooling?

Amazing Performance

Because liquid-cooled components operate well below the CPU's TDP, servers run with exceptional stability.

Energy Conservation

A liquid-cooled server can outperform an air-cooled one while using fewer fans, running them at lower speeds, and consuming less electricity.

Reduced Noise

Servers normally need many loud, high-speed fans. By combining fewer fans with liquid cooling, GIGABYTE has cut down on noise.

A track record of success

The supplier of the direct liquid cooling system has 20 years of experience providing cooling solutions for desktop PCs and data centres, and GIGABYTE itself has more than 20 years of experience.

Dependability

Liquid cooling solutions need minimal maintenance and can be monitored. Both GIGABYTE and its liquid cooling suppliers offer component warranties.

Usability

GIGABYTE servers that ship with a liquid cooling kit can be installed in a rack or connected to a building's water system, and the kit provides dry, simple quick disconnects.

Elevated Efficiency

Compatible with NVIDIA HGX H100 8-GPU

The NVIDIA H100 Tensor Core GPU delivers an order-of-magnitude leap for large-scale AI and HPC, with unmatched performance, scalability, and security for every data centre. The NVIDIA NVLink Switch System allows direct communication between up to 256 GPUs, and NVIDIA AI Enterprise streamlines the development and deployment of AI. With a dedicated Transformer Engine for trillion-parameter language models, the H100 accelerates workloads ranging from right-sized Multi-Instance GPU (MIG) partitions to exascale.

Energy Efficiency

Automatic Fan Speed Control

Automatic Fan Speed Control is enabled on GIGABYTE servers to provide optimal cooling and power efficiency. Individual fan speeds are adjusted automatically according to temperature sensors placed strategically throughout the server.
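As a rough illustration of how sensor-driven fan control can work, here is a minimal sketch of a linear fan curve in which each fan zone follows its hottest sensor. It is not GIGABYTE's firmware logic: the temperature thresholds, duty-cycle limits, zone names, and function names are all hypothetical.

```python
# Minimal fan-curve sketch. All thresholds and names are hypothetical and
# are NOT taken from GIGABYTE firmware.

def fan_duty_percent(temp_c: float,
                     min_temp_c: float = 40.0, max_temp_c: float = 80.0,
                     min_duty: float = 30.0, max_duty: float = 100.0) -> float:
    """Map a temperature to a fan duty cycle using linear interpolation."""
    if temp_c <= min_temp_c:
        return min_duty            # cool enough: idle speed
    if temp_c >= max_temp_c:
        return max_duty            # too hot: full speed
    fraction = (temp_c - min_temp_c) / (max_temp_c - min_temp_c)
    return min_duty + fraction * (max_duty - min_duty)

def zone_duties(sensor_temps_by_zone: dict[str, list[float]]) -> dict[str, float]:
    """Set each fan zone from the hottest sensor reading in that zone."""
    return {zone: fan_duty_percent(max(temps))
            for zone, temps in sensor_temps_by_zone.items()}

readings = {"cpu": [52.0, 60.0], "gpu": [71.0, 85.0], "storage": [33.0]}
for zone, duty in zone_duties(readings).items():
    print(f"{zone}: {duty:.0f}% duty")
```

Driving each zone from its hottest sensor is a conservative choice: one hot component is enough to raise airflow, while cool zones stay at low speed and save power.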

High Availability

Smart Ride Through (SmaRT)

To guard against data loss and server downtime caused by AC power outages, GIGABYTE has implemented SmaRT on all of its server platforms. When such an event occurs, the system throttles, maintaining availability while lowering power consumption. Capacitors in the power supply can provide power for 10–20 ms, which is enough time to switch to a backup power source and continue operating.
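To see why throttling helps during the ride-through window, consider the energy stored in a capacitor, E = ½·C·(V_start² − V_min²), divided by the load power. The sketch below applies that formula with made-up component values; the capacitance, voltages, and load figures are assumptions chosen only to land in the 10–20 ms range mentioned above, and conversion losses are ignored.

```python
# Hold-up time estimate from stored capacitor energy.
# Capacitance, voltages, and loads below are hypothetical illustration
# values, not measurements of any GIGABYTE power supply.

def hold_up_ms(capacitance_f: float, v_start: float, v_min: float,
               load_w: float) -> float:
    """Milliseconds a capacitor bank can carry `load_w` while its voltage
    falls from v_start to v_min (ideal case, no conversion losses)."""
    usable_energy_j = 0.5 * capacitance_f * (v_start**2 - v_min**2)
    return usable_energy_j / load_w * 1000.0

BULK_CAPACITANCE_F = 1500e-6   # assumed bulk capacitance
V_START, V_MIN = 400.0, 300.0  # assumed usable voltage window

full_load = hold_up_ms(BULK_CAPACITANCE_F, V_START, V_MIN, load_w=3000.0)
throttled = hold_up_ms(BULK_CAPACITANCE_F, V_START, V_MIN, load_w=1500.0)

print(f"Full load (3000 W): {full_load:.1f} ms")
print(f"Throttled (1500 W): {throttled:.1f} ms")
```

Because hold-up time is inversely proportional to load, halving the power draw doubles the time available to switch over to a backup source.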

Smart Crises Management and Protection (SCMP)

SCMP is a GIGABYTE-patented feature used in servers without a fully redundant PSU design. If a PSU fails or the system overheats, SCMP forces the CPU into an ultra-low power mode to reduce the power load, which prevents an unplanned shutdown and protects against component damage or data loss.

Dual ROM Architecture

If the ROM that stores the BMC and BIOS fails to boot, the backup BMC and/or BIOS takes over from the primary upon system reset. Once the primary BMC is updated, the backup BMC’s ROM is updated automatically through synchronisation. The BIOS can be updated to the firmware version the user chooses.
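The failover and synchronisation behaviour described here can be modelled in a few lines. This is only a conceptual sketch: the `FirmwareImage` type, the function names, and the version strings are hypothetical and do not reflect how the firmware is actually implemented.

```python
# Conceptual model of dual-ROM failover and BMC synchronisation.
# Types, names, and version strings are hypothetical illustrations.
from dataclasses import dataclass

@dataclass(frozen=True)
class FirmwareImage:
    name: str
    version: str
    bootable: bool

def select_boot_image(primary: FirmwareImage,
                      backup: FirmwareImage) -> FirmwareImage:
    """On reset, boot the primary image if it works, else the backup."""
    if primary.bootable:
        return primary
    if backup.bootable:
        return backup
    raise RuntimeError("neither ROM image is bootable")

def sync_backup_bmc(primary: FirmwareImage,
                    backup: FirmwareImage) -> FirmwareImage:
    """Return the backup BMC image brought up to the primary's version."""
    if backup.version == primary.version:
        return backup
    return FirmwareImage(backup.name, primary.version, backup.bootable)

primary_bmc = FirmwareImage("bmc-primary", "2.0", bootable=False)  # corrupted
backup_bmc = FirmwareImage("bmc-backup", "1.9", bootable=True)

print(select_boot_image(primary_bmc, backup_bmc).name)

updated_primary = FirmwareImage("bmc-primary", "2.1", bootable=True)
print(sync_backup_bmc(updated_primary, backup_bmc).version)
```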

Hardware Security

TPM 2.0 Module Option

Passwords, encryption keys, and digital certificates are kept in a TPM module for hardware-based authentication to keep unauthorised users from accessing your data. There are two types of GIGABYTE TPM modules: Low Pin Count and Serial Peripheral Interface.

Easy to Use

Tool-free Drive Bay Design

A clip mechanism holds the drive firmly in place. Installing or replacing a drive takes seconds.

Accredited and Prepared with Software Partners

As a member of key software alliance partner programmes, GIGABYTE can rapidly develop and validate joint solutions, helping customers modernise their data centres and deploy IT infrastructure and application services quickly, flexibly, and cost-effectively.

Management with Added Value

GIGABYTE provides free management applications that run on a dedicated small processor built into the server.

GIGABYTE Management Console

The GIGABYTE Management Console is pre-installed on every server and can be used to manage and maintain a single server or a small cluster. Once the servers are running, IT staff can use its browser-based graphical user interface to monitor and manage each server’s health in real time. The GIGABYTE Management Console also offers:

  • Support for industry-standard IPMI specifications, which lets users integrate services into a single platform through an open interface
  • Automatic event recording, which captures system behaviour up to 30 seconds before an event occurs and makes it easier to decide what to do next
  • Integration of SAS, SATA, and NVMe devices and RAID controller firmware into the GIGABYTE Management Console to monitor and manage Broadcom MegaRAID adapters
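Capturing the moments before an event is typically done with a rolling buffer that always holds the most recent window of samples. The sketch below shows that general technique with a 30-second window; it is an assumption about how such a recorder could be built, not a description of the Management Console's internals, and the class name and sample fields are hypothetical.

```python
# Rolling pre-event buffer: keeps only the last `window_s` seconds of samples.
# Illustrative technique only; class and field names are hypothetical.
from collections import deque

class PreEventRecorder:
    def __init__(self, window_s: float = 30.0) -> None:
        self.window_s = window_s
        self._samples: deque = deque()   # (timestamp_s, reading) pairs

    def record(self, timestamp_s: float, reading: dict) -> None:
        """Store a sample and discard anything older than the window."""
        self._samples.append((timestamp_s, reading))
        cutoff = timestamp_s - self.window_s
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def snapshot(self) -> list:
        """Return the buffered history, e.g. when an event fires."""
        return list(self._samples)

recorder = PreEventRecorder(window_s=30.0)
for second in range(61):                       # one sample per second
    recorder.record(float(second), {"cpu_temp_c": 50 + second})

history = recorder.snapshot()                  # taken at t = 60 s
print(f"{len(history)} samples, from t={history[0][0]} to t={history[-1][0]}")
```

Memory use stays bounded because old samples are dropped as new ones arrive, so the recorder can run continuously.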

GIGABYTE Server Management (GSM)

GSM is a software suite that can manage several server clusters at once over the internet. It runs on any GIGABYTE server and supports Windows and Linux. GSM can be downloaded from the GIGABYTE website and complies with the IPMI and Redfish standards. It includes a full set of system administration features, among them the following tools:

  • GSM Server: Software that runs on an administrator’s PC or a server in the cluster to enable real-time, remote control via a graphical user interface. It simplifies maintenance of large server clusters.
  • GSM CLI: A command-line interface designed for remote management and monitoring.
  • GSM Agent: An application that is installed on every GIGABYTE server node and interfaces with GSM Server or GSM CLI to retrieve data from all systems and devices via the operating system.
  • GSM Mobile: An iOS and Android mobile application that gives administrators access to real-time system data.
  • GSM Plugin: An application programming interface that lets users monitor and manage server clusters in real time from VMware vCenter.
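Since Redfish is a REST-style standard, any conforming management endpoint can be queried with ordinary HTTPS requests. The sketch below only builds such a request and does not send it. The `/redfish/v1/Systems` collection path comes from the DMTF Redfish specification; the host name, credentials, and helper function are placeholders and say nothing about how GSM itself is configured.

```python
# Build (but do not send) a Redfish query using only the standard library.
# Host name and credentials are placeholders; the path is the standard
# Redfish Systems collection defined by the DMTF specification.
import base64
import urllib.request

def redfish_request(host: str, path: str,
                    username: str, password: str) -> urllib.request.Request:
    """Return a GET request for a Redfish resource using HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{host}{path}",
        headers={"Authorization": f"Basic {token}",
                 "Accept": "application/json"},
        method="GET",
    )

request = redfish_request("bmc.example.internal", "/redfish/v1/Systems",
                          "admin", "change-me")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response, with TLS verification set up to suit the management network.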

G593-SD0 Features

  • CPU+GPU Direct liquid cooling solution
  • Liquid cooled NVIDIA HGX H100 with 8 x SXM5 GPUs
  • 900GB/s GPU-to-GPU bandwidth with NVIDIA NVLink and NVSwitch
  • Dual 5th/4th Gen Intel Xeon Scalable Processors
  • Dual Intel Xeon CPU Max Series
  • 8-Channel RDIMM DDR5, 32 x DIMMs
  • Dual ROM Architecture
  • Compatible with NVIDIA BlueField-2 DPUs
  • 2 x 10Gb/s LAN ports via Intel X710-AT2
  • 8 x 2.5″ Gen5 NVMe/SATA/SAS-4 hot-swappable bays
  • 12 x LP PCIe Gen5 x16 slots
  • 1 x LP PCIe Gen4 x16 slot
  • 4+2 3000W 80 PLUS Titanium redundant power supplies
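
Two of the figures in this list are easier to read after a quick calculation. The sketch below derives DIMMs per memory channel and the usable power budget, assuming "4+2" means four supplies carry the load while two are held in reserve; that reading of the notation and the variable names are assumptions, not datasheet statements.

```python
# Quick arithmetic on the feature list above.
# Assumption: "4+2" means 4 load-bearing PSUs plus 2 redundant spares.

CPU_SOCKETS = 2
MEMORY_CHANNELS_PER_CPU = 8
TOTAL_DIMM_SLOTS = 32

PSU_WATTS = 3000
LOAD_BEARING_PSUS = 4
REDUNDANT_PSUS = 2

dimms_per_channel = TOTAL_DIMM_SLOTS // (CPU_SOCKETS * MEMORY_CHANNELS_PER_CPU)
usable_power_w = LOAD_BEARING_PSUS * PSU_WATTS
installed_power_w = (LOAD_BEARING_PSUS + REDUNDANT_PSUS) * PSU_WATTS

print(f"DIMMs per memory channel: {dimms_per_channel}")
print(f"Usable power budget:      {usable_power_w} W")
print(f"Installed PSU capacity:   {installed_power_w} W")
```

Under these assumptions the system has two DIMMs per channel and a 12,000 W budget that survives the loss of up to two power supplies.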