Monday, May 20, 2024

Breaking Barriers: Intel's Latest Supercomputer Progress

What’s Fresh: Intel’s Latest Supercomputers

At SC23, Intel demonstrated AI-accelerated high performance computing (HPC), with leading performance for HPC and AI workloads across the Intel Data Center GPU Max Series, Intel Gaudi2 AI accelerators, and Intel Xeon CPUs. As part of its collaboration with Argonne National Laboratory on the Aurora generative AI (gen AI) project, Intel provided an update on a 1-trillion-parameter GPT-3 LLM running on the Aurora supercomputer, made possible by the unique architecture of the Max Series GPU and the system capabilities of Aurora.

Using applications from the Exascale Computing Project and the Aurora Early Science Program (ESP), Intel and Argonne showcased how science is being accelerated at scale. The company also outlined the path to Intel Gaudi3 AI accelerators and Falcon Shores.

Why It Is Important:

These performance and benchmark results, together with generative AI for science, highlight Intel's ability to deliver solutions tailored to the distinct needs of HPC and AI customers. Intel's software-defined approach, built on oneAPI and AI-enhanced HPC toolkits, lets developers port their code across architectures with little effort, accelerating scientific discovery. In addition, a number of supercomputers coming online will use Max Series GPUs and CPUs.

Regarding Generative AI for Research:

Intel and Argonne National Laboratory are collaborating on the Aurora gen AI project, whose goal is to develop state-of-the-art foundation AI models for science together with other partners. The models will be trained at scales of more than one trillion parameters on scientific texts, code, and research datasets drawn from a range of scientific disciplines. Built on the foundational technologies of Megatron and DeepSpeed, the project will serve several scientific fields, including biology, cancer research, climate science, cosmology, and materials science.
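For context, the sketch below shows how training at this scale is commonly configured with DeepSpeed's ZeRO sharding and BF16 precision. The values are illustrative assumptions for the sketch, not the actual Aurora gen AI configuration.

# Illustrative DeepSpeed configuration for very large model training.
# All values are assumptions for this sketch, not the Aurora gen AI settings.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 64,
    "bf16": {"enabled": True},      # BF16 mixed precision
    "zero_optimization": {
        "stage": 3,                 # ZeRO-3 shards parameters, gradients, and optimizer states
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_clipping": 1.0,
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Typically launched with the Megatron-DeepSpeed training scripts, for example:
#   deepspeed pretrain_gpt.py --deepspeed --deepspeed_config ds_config.json ...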

The unique architecture of the Intel Max Series GPU and the system capabilities of the Aurora supercomputer allow a 1-trillion-parameter model to be held on just 64 nodes, far fewer than would normally be required. Argonne National Laboratory demonstrated the ability to run multiple instances in parallel on Aurora by running four instances across 256 nodes. This paves the way for quickly scaling the training of multi-trillion-parameter models on trillions of tokens across more than 10,000 nodes.
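A rough back-of-the-envelope estimate shows why 64 nodes are enough. Assuming six Max Series GPUs per Aurora node with 128 GB of HBM each, and roughly 18 bytes of training state per parameter (BF16 weights and gradients plus FP32 optimizer state), the aggregate memory comfortably covers the model state; these per-node figures are assumptions for the sketch, not official Aurora specifications.

# Back-of-the-envelope memory estimate; the hardware figures are assumptions.
params = 1.0e12                 # 1 trillion parameters
bytes_per_param = 18            # ~BF16 weights + gradients + FP32 optimizer states (Adam-style)
model_state_tb = params * bytes_per_param / 1e12
print(f"Model state: ~{model_state_tb:.0f} TB")                           # ~18 TB

gpus_per_node = 6               # assumed GPUs per Aurora node
hbm_per_gpu_gb = 128            # assumed HBM capacity per GPU
nodes = 64
aggregate_hbm_tb = nodes * gpus_per_node * hbm_per_gpu_gb / 1024
print(f"Aggregate HBM across {nodes} nodes: ~{aggregate_hbm_tb:.0f} TB")  # ~48 TB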

About Intel and Argonne National Laboratory:

Intel and Argonne National Laboratory showcased how the system capabilities and software stack on Aurora enable the acceleration of science at scale. Examples of workloads include:

The connectomics ML workload enables brain connectome reconstruction at scale, demonstrating competitive inference throughput on more than 500 Aurora nodes.

The General Atomic and Molecular Electronic Structure System (GAMESS) ran nearly two times faster on the Intel Max GPU than on the Nvidia A100. This allows the Aurora supercomputer to model complex chemical reactions in drug and catalyst development, helping to unlock the mysteries of molecular science.

The Hardware/Hybrid Accelerated Cosmology Code (HACC) has run on more than 1,500 Aurora nodes, making it possible to simulate and visualize the evolution of the universe and the physical laws that govern it.

As part of the Aurora Drug Discovery Early Science Project (ESP), an AI-based drug-screening inference workload screens more than 20 billion compounds on just 256 nodes, enabling the efficient screening of very large chemical datasets.
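As a rough illustration of the scale involved, the arithmetic below divides the 20 billion compounds across 256 nodes; the per-node inference rate is a purely hypothetical assumption.

# Simple scale arithmetic; the per-node rate is a hypothetical assumption.
compounds = 20e9
nodes = 256
per_node = compounds / nodes
print(f"Compounds per node: {per_node:,.0f}")              # ~78 million

assumed_rate_per_node = 50_000     # hypothetical compounds scored per second per node
hours = per_node / assumed_rate_per_node / 3600
print(f"Wall time at that rate: ~{hours:.1f} hours")       # ~0.4 hours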

Additionally, Intel demonstrated improved AI and HPC performance, along with software enhancements spanning hardware and applications:

Intel and Dell released results for STAC-A2, an independent benchmark suite based on real-world market-risk analysis workloads, demonstrating strong performance for the financial services industry. Four Intel Data Center GPU Max 1550 GPUs delivered 4.3x greater space efficiency and 26% higher warm Greeks 10-100k-1260 performance compared with eight Nvidia H100 PCIe GPUs.

Across a variety of HPC workloads, the Intel Data Center GPU Max 1550 outperforms the Nvidia H100 PCIe card by an average of 36% (1.36x).

The Intel Data Center GPU Max Series provides improved support for AI models, including many large language models (LLMs) such as GPT-J and Llama 2.

The Intel Xeon CPU Max Series, the only x86 processor with high bandwidth memory (HBM), outperformed the AMD Epyc Genoa processor by an average of 19%.

MLCommons released the results of the industry-standard MLPerf Training v3.1 benchmark for AI model training last week. On the v3.1 GPT-3 training test, Intel Gaudi2 showed a notable 2x performance gain from implementing the FP8 data format, which halves the data width relative to BF16 and so roughly doubles effective math throughput and memory bandwidth on hardware with native FP8 support.

Intel will introduce the Intel Gaudi3 AI accelerator in 2024. Built on the same high-performance architecture as Gaudi2, the Gaudi3 AI accelerator is expected to deliver four times the compute (BF16), twice the networking bandwidth for improved scale-out performance, and 1.5 times the on-board HBM capacity, addressing the growing demand for high-performance, high-efficiency LLM compute.

On HPC workloads such as LAMMPS (Copper), 5th Gen Intel Xeon processors are expected to deliver 1.4x better performance than the previous generation.

The next-generation Intel Xeon processor, code-named Granite Rapids, will feature higher core counts, integrated acceleration with Intel Advanced Matrix Extensions, and support for Multiplexer Combined Ranks (MCR) DIMMs. Granite Rapids is expected to deliver 2.9x better AI inference performance for DeepMD+LAMMPS. Based on DDR5, MCR DIMMs reach 8,800 megatransfers per second and more than 1.5 terabytes per second of memory bandwidth in a two-socket system, which is essential for feeding the rapidly growing core counts of modern CPUs with both speed and flexibility.
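The bandwidth figure follows directly from the transfer rate and the channel count. Assuming 12 DDR5 memory channels per socket (an assumption for this sketch, not a confirmed Granite Rapids specification), the theoretical peak for a two-socket system at 8,800 MT/s works out as follows:

# Theoretical peak memory bandwidth; the channel count is an assumption.
mt_per_s = 8800e6              # 8,800 megatransfers per second per channel
bytes_per_transfer = 8         # 64-bit DDR5 data path per channel
channels_per_socket = 12       # assumed for this sketch
sockets = 2

bandwidth_tb_s = mt_per_s * bytes_per_transfer * channels_per_socket * sockets / 1e12
print(f"Peak bandwidth: ~{bandwidth_tb_s:.2f} TB/s")       # ~1.69 TB/s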

Regarding Recent Developments in oneAPI:

Intel revealed capabilities in its 2024 software development tools that push the boundaries of open, multiarchitecture programming with oneAPI. The new tools help developers extend AI and HPC capabilities on Intel CPUs and GPUs, including faster performance and simpler deployment for numerical workloads written in standard Python, as well as compiler enhancements that provide a near-complete implementation of SYCL 2020 to increase productivity and simplify code offload.
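As an example of the standard-Python path, Intel's Data Parallel Extensions for Python include dpnp, a NumPy-like array library that executes on SYCL devices such as Intel GPUs. The minimal sketch below assumes dpnp is installed and a supported device is available; it uses only the package's advertised drop-in NumPy-style calls and is not tied to a specific release.

# Minimal sketch using dpnp, Intel's NumPy-like library for SYCL devices.
# Assumes dpnp is installed and an Intel GPU or other SYCL device is present.
import dpnp as np

a = np.ones(1_000_000, dtype=np.float32)
c = (2 * a).sum()              # elementwise multiply and reduction on the default SYCL device
print(float(c))                # 2000000.0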

Furthermore, the oneAPI Center of Excellence at the Texas Advanced Computing Center (TACC) will focus on developing and refining seismic imaging benchmark codes. With 32 oneAPI Centers of Excellence around the globe, Intel fosters an ecosystem in which hardware and software research and innovation advance the industry.

Next Steps:

Intel underlined the market's momentum and its commitment to AI and HPC. New supercomputer deployments incorporating Intel Max Series GPU and CPU technologies include Aurora, Dawn Phase 1, SuperMUC-NG Phase 2, Clementina XX1, and others. Among the new systems built with Intel Gaudi2 accelerators is a large AI supercomputer for which Stability AI is the anchor customer.

Falcon Shores, Intel's next-generation GPU for AI and HPC, will build on this momentum. Falcon Shores will combine the intellectual property (IP) of Intel Gaudi and Intel Xe under a single GPU programming interface built on oneAPI. Applications developed today on Intel Max Series GPUs and Intel Gaudi AI accelerators will be able to transition easily to Falcon Shores.
