Optimizing Enterprise AI Deployments with a New NVIDIA Storage Partner Validation Program
For NVIDIA OVX computing systems, international businesses can now choose from a variety of NVIDIA-validated storage options. Business innovation is being driven by a sharp increase in the deployment of generative AI for enterprises across various industries. However, it’s also creating a lot of work for their IT teams, as lengthy and intricate infrastructure deployment cycles slow them down and make it difficult for them to quickly spin up AI workloads using their own data.
NVIDIA OVX
To help overcome these obstacles, NVIDIA has launched a storage partner validation program for NVIDIA OVX computing systems. The first to complete NVIDIA OVX storage validation are high-performance storage systems from DDN, Dell PowerScale, NetApp, Pure Storage, and WEKA.
NVIDIA OVX servers
NVIDIA OVX servers are designed to handle a variety of complex AI and graphics-intensive workloads by combining high-performance, GPU-accelerated compute with low-latency networking and high-speed storage access. Storage access matters most for data-intensive generative AI workloads such as chatbots, summarization, and search tools, where high-performance storage is essential to maximizing system throughput.
The new program offers partners a standardized procedure to validate their storage appliances, helping businesses match the appropriate storage with NVIDIA-Certified OVX servers. The program uses the same framework and testing procedures required for validating storage for the NVIDIA DGX BasePOD reference architecture.
To obtain validation, partners must complete a series of NVIDIA tests that assess storage performance and input/output scaling across multiple parameters, representing the rigorous demands of diverse enterprise AI workloads. The tests cover combinations of I/O sizes, thread counts, buffered versus direct I/O, random reads, rereads, and other parameters.
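To make the idea of a multi-parameter test matrix concrete, here is a minimal Python sketch that enumerates every combination of the dimensions named above. The specific values (I/O sizes, thread counts) and all names in the code are assumptions for illustration only; the article does not publish NVIDIA's actual test parameters or tooling.

```python
from itertools import product

# Hypothetical values: the article names these dimensions but not their settings.
IO_SIZES_KIB = [4, 128, 1024]
THREAD_COUNTS = [1, 8, 32]
IO_MODES = ["buffered", "direct"]
ACCESS_PATTERNS = ["random_read", "reread"]


def build_test_matrix():
    """Return one dict per combination of the four dimensions above."""
    return [
        {"io_size_kib": size, "threads": threads, "io_mode": mode, "pattern": pattern}
        for size, threads, mode, pattern in product(
            IO_SIZES_KIB, THREAD_COUNTS, IO_MODES, ACCESS_PATTERNS
        )
    ]


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(f"{len(matrix)} test cases")
    print(matrix[0])
```

Even with only a few values per dimension, the number of cases grows multiplicatively, which is one reason a standardized framework is useful.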
Every test is run several times to confirm the results and collect the necessary data, which NVIDIA engineering teams then review to determine whether the storage system has passed.
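As an illustration of why repeated runs matter, the following Python sketch summarizes several measurements of one test and flags whether they agree with each other. The function name, the throughput unit, and the 5% variation threshold are all hypothetical; the article does not describe the criteria NVIDIA engineers actually apply.

```python
from statistics import mean, pstdev


def summarize_runs(throughputs_gbps, max_cv=0.05):
    """Summarize repeated measurements of a single test.

    max_cv is a hypothetical limit on the coefficient of variation
    (standard deviation divided by mean); it is not an NVIDIA criterion.
    """
    if len(throughputs_gbps) < 2:
        raise ValueError("need at least two runs to judge consistency")
    avg = mean(throughputs_gbps)
    if avg <= 0:
        raise ValueError("throughput values must be positive")
    cv = pstdev(throughputs_gbps) / avg
    return {"mean": avg, "cv": cv, "consistent": cv <= max_cv}


if __name__ == "__main__":
    print(summarize_runs([10.1, 9.9, 10.0]))
```

A low coefficient of variation only shows that the runs are repeatable; whether the mean is high enough to pass is a separate judgment.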
For enterprise AI workloads using NVIDIA OVX systems, the program provides prescriptive guidance to help ensure optimal storage performance and scalability. The overall design nevertheless remains flexible: customers can tailor their system and storage choices to fit their existing data center environments and bring accelerated computing to wherever their data resides.
IT teams must carefully consider their compute, networking, storage, and software choices in order to ensure high performance and scalability for generative AI use cases, which have fundamentally different requirements than traditional enterprise applications.
Enterprise-grade performance, manageability, security, and scalability for AI workloads are guaranteed by NVIDIA-Certified Systems, which undergo rigorous testing and validation. Compared to separately building from scratch, their adaptable reference architectures enable faster, more cost-effective, and more efficient deployments.
OVX servers are powered by NVIDIA L40S GPUs and include NVIDIA AI Enterprise software, NVIDIA BlueField-3 DPUs, and either NVIDIA Quantum-2 InfiniBand or NVIDIA Spectrum-X Ethernet networking. They are optimized for generative AI workloads such as training smaller LLMs (for example, Llama 2 7B or 70B), fine-tuning existing models, and high-throughput, low-latency inference.
Global system vendors such as GIGABYTE, Hewlett Packard Enterprise, and Lenovo are now shipping NVIDIA OVX servers. Each system builder works with NVIDIA to provide comprehensive, enterprise-grade support for these servers.
Availability
In the coming weeks, all storage and system vendors will release reference architectures alongside their validated storage solutions for NVIDIA-Certified OVX servers.
Systems Certified by NVIDIA
The systems listed here have undergone testing using the newest NVIDIA networking and GPUs, and their performance, functionality, scalability, and security have all been assessed by NVIDIA engineers.
NVIDIA-Certified Systems
To assist its partners in delivering the best-performing systems, the NVIDIA-Certified Systems program has put together the most comprehensive set of accelerated workload performance tests in the industry. NVIDIA engineers test NVIDIA-Certified Systems using the strongest enterprise NVIDIA GPUs and networking, assessing them for functionality, performance, scalability, and security. Enterprises can swiftly deploy optimized platforms for AI, Data Analytics, HPC, high-density VDI, and other accelerated workloads in the data center, at the edge, and on desktops thanks to NVIDIA-Certified Systems, which have been shown to deliver predictable performance.
In addition to servers intended for data centers, GPU-powered workstations, high-density VDI systems, and Edge devices are now included in NVIDIA’s NVIDIA-Certified Systems program. Systems with NVIDIA certification for the data center undergo testing in both single- and dual-node configurations. The NVIDIA-Certified Systems program evaluates workstations, high-density VDI, and Edge systems using NVIDIA GPUs in a single system.
NVIDIA NGC Catalog software is network-optimized and GPU-accelerated for AI and other compute-intensive tasks. It speeds deployments and time-to-solution with carefully selected containers, pre-trained models, resources, SDKs, and Helm charts. NGC software is compatible with numerous NVIDIA GPU-accelerated platforms, such as NVIDIA DGX Systems, on-premises servers from NVIDIA partners, and top cloud platforms.
NVIDIA-Certified Systems are tested with software from the NGC Catalog, and NVIDIA AI Enterprise Support Services offers customers enterprise-grade support for a fee. Direct access to NVIDIA subject matter experts through NVIDIA AI Enterprise Support Services lets customers resolve software problems promptly, minimizing system downtime and optimizing system utilization and user productivity.
Although customers can purchase and install NVIDIA-Certified systems for the data center using any type of networking, they are tested using NVIDIA networking. NVIDIA AI Enterprise Support Services are resold by NVIDIA partners. These services are offered to NVIDIA-Certified systems installed in any data center and equipped with any kind of network adapter.
Systems Testing Certified by NVIDIA
Systems that have earned NVIDIA certification have successfully passed a demanding battery of functional and performance tests. Systems with NVIDIA certification for the data center undergo testing in both single- and dual-node configurations. In the NVIDIA-Certified Systems program, workstations, high-density VDI systems, and Edge devices are assessed based on how well they function independently when using NVIDIA GPUs in a single system.
NVIDIA-Certified Systems testing of edge devices, workstations, high-density VDI systems, and data center servers covers:
- TensorFlow and PyTorch for single and multi-GPU Deep Learning training performance
- High-throughput, low-latency inference with NVIDIA Triton and NVIDIA TensorRT
- GPU-Powered Machine Learning & Data Analytics with RAPIDS
- Developing applications with the NVIDIA CUDA Toolkit and NVIDIA HPC SDK
Data center servers certified by NVIDIA are also examined for:
- Performance of Multi-Node Deep Learning Training
- Fast packet processing, low latency networking, and large bandwidth
- Security at the system level and hardware-based key management
System Compatibility for NVIDIA AI Enterprise
Versions 2.0 and higher of NVIDIA AI Enterprise enable both virtualized and bare metal deployments. For bare metal deployments, all NVIDIA-Certified Data Center Servers and NGC-Ready servers with qualified NVIDIA GPUs are NVIDIA AI Enterprise Compatible. The following is a list of NVIDIA-Certified systems that NVIDIA has approved for use in a VMware vSphere environment.
The VMware Compatibility Guide contains a list of all partner systems’ NVIDIA GPUs that VMware vSphere supports. See the Red Hat Certified Hardware page for supported systems when deploying Red Hat Enterprise Linux bare metal.
The NGC-Ready systems documentation site contains a list of NGC-Ready systems that are compatible with NVIDIA AI Enterprise.
Technical documentation for NVIDIA AI Enterprise
The end-to-end, cloud-native software platform NVIDIA AI Enterprise speeds up data science workflows and simplifies the creation and implementation of production-grade co-pilots and other generative AI applications. For businesses that rely on artificial intelligence, simple-to-use microservices offer optimal model performance along with enterprise-grade security, stability, and support, facilitating an easy transition from prototype to production.
List of Systems with NVIDIA Certification – OVX Servers
NVIDIA has certified the OVX servers in the following table as NVIDIA-Certified Systems with the supported NVIDIA GPUs indicated. OVX servers are tested with the BlueField-3 B3220 for Zero Trust deployments.
Partner | OVX Server | Supported NVIDIA GPUs | Supported NVIDIA Network Device | Zero Trust Compatible |
---|---|---|---|---|
GIGABYTE | G493-SB0-A | L40S, L40 | CX7 | Yes |
HPE | ProLiant DL385 Gen11 | L40S, L40 | CX7 | Yes |
HPE | ProLiant DL380a Gen11 | L40S, L40 | B3140 | Yes |
Lenovo | ThinkSystem SR675 V3 | L40S, L40 | CX7 | Yes |
List of Data Center Server Systems with NVIDIA Certification
NVIDIA has verified the following data center servers as NVIDIA-Certified Systems with the supported NVIDIA GPUs indicated. A server certified with a listed GPU is certified for all widely available variants of that GPU. For example, every system verified for the A100 GPU is also approved for the A100 40 GB and A100 80 GB variants.
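The variant rule can be read as a simple lookup: a certification for a GPU family expands to its widely available variants. The Python sketch below shows that reading under stated assumptions. The mapping contains only the A100 example given in the text, and the function and dictionary names are hypothetical, not part of any NVIDIA tool.

```python
# Only the A100 entry comes from the text; other families would need
# to be filled in from NVIDIA's published lists.
GPU_VARIANTS = {
    "A100": ["A100 40 GB", "A100 80 GB"],
}


def certified_variants(certified_gpus):
    """Expand each certified GPU family into its known variants.

    Families without an entry in GPU_VARIANTS are returned unchanged.
    """
    expanded = []
    for gpu in certified_gpus:
        expanded.extend(GPU_VARIANTS.get(gpu, [gpu]))
    return expanded


if __name__ == "__main__":
    print(certified_variants(["A100", "L40S"]))
```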
List of Workstations Certified by NVIDIA
Workstations are NVIDIA-Certified Systems.
NVIDIA-Certified data science workstations pack optimized hardware and NVIDIA CUDA-X AI-based software. Employing enterprise-class GPUs, they expedite data science workloads and have been verified for peak performance, dependability, and alignment with NVIDIA services and applications.
List of Mobile Workstations with NVIDIA Certification
Mobile Workstations are NVIDIA-Certified Systems.
NVIDIA-Certified mobile workstations have optimized hardware and NVIDIA CUDA-X AI-based data science software stacks. Employing enterprise-class GPUs, they expedite data science workloads and have been verified for peak performance, dependability, and alignment with NVIDIA services and applications.
Software Supported by NVIDIA Certified Systems
NVIDIA-Approved Software Environment for Systems Testing
Systems with NVIDIA certification have undergone testing in a software environment that is standardized and offers the best performance and stability. Production versions of the following are currently used in the NVIDIA-Certified systems software test environment:
- NVIDIA drivers for Ubuntu 20.04
- MLNX_OFED network adapter drivers
- NVIDIA Cloud Native Core utilizing Kubernetes and NVIDIA GPU Operator