IBM z17: AI Power with Spyre Accelerator and Telum II

April 8, 2025


IBM z17

The IBM Spyre Accelerator and the IBM Telum II processor are at the core of the IBM z17. These chips, the result of years of work by numerous researchers, are built to run AI tasks at a speed and scale that was not possible before.

IBM has announced z17, the newest version of its mainframe, a platform that drives businesses and manages 70% of global financial activities. The brains behind the new system's AI acceleration are two components designed and co-developed at IBM Research: the embedded AI accelerator core of the Telum II processor and the Spyre Accelerator, which will be released in Q4 2025.

With these components, IBM z17 is meant to usher in a new era of enterprise AI applications. In sectors where data must stay secure yet remain accessible with minimal latency, z17 with Spyre will let companies run generative AI models and agentic AI on-site.

The 32-core Spyre accelerator for the IBM z17 will be offered as an add-on PCIe card, with the option to add more cards as required. Both Spyre and Telum II build on Telum, the processor that powers the z16 system and introduced an industry-first on-chip AI accelerator. Spyre was built on top of that accelerator and relies on low-precision computation and an AI-centric design for low-latency inferencing.

The IBM Research team that created the Telum II on-chip accelerator and the Spyre accelerator had to overcome a significant obstacle: delivering the power of AI to IBM's infrastructure clients at a speed and scale not seen before, while AI itself is still developing. They built something to handle tomorrow's workloads as well as today's, regardless of which AI models gain or lose relevance.

Telum II and Spyre achieve their AI inference capabilities through hardware and software co-design, the outcome of cooperation between IBM teams and direct input from IBM Z clients. A full software stack sits at the core of IBM z17, and striking the right balance between hardware and software innovation was essential to the creation of Spyre and Telum II.

The results speak for themselves. In early tests, a prototype Spyre processed more than three times as many images per second per watt as top-tier GPUs. Given the massive estimates of the energy needed to power AI workloads, the researchers realized they were on to something.

The Telum II processor, mounted on a dual-chip module, in the new IBM z17 (Image credit: IBM)

An AI-specific chip

The processing power needed for AI is far higher than that of regular applications. In order to focus on meeting the upcoming energy demands of AI with more efficient technology, IBM Research established the IBM Research AI Hardware Center in 2019. IBM Research has been working diligently on AI-specific chips for nearly ten years.

Low-precision computing is an AI hardware technique that can significantly increase the power efficiency of computers that perform AI calculations. The path to Spyre started in 2015 with a two-page white paper at IBM Research, long before the AI Hardware Center was established. That brief report gave Burns and his team permission to look into the viability of using approximate computing for deep learning.

They believed that rather than attempting to execute approximate computation on the available GPUs or CPUs, creating low-precision hardware from the ground up would result in better power performance for deep learning.
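To make the idea of low-precision computation concrete, here is a minimal sketch in Python. It is an illustration only, not IBM's method: the article does not describe Spyre's number formats, so the 8-bit symmetric quantization scheme, the function names, and the sample values below are all assumptions chosen for clarity. The sketch shows how a dot product, the basic operation inside a neural network layer, can be carried out on small integers and still land close to the full-precision answer.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale.

    Hypothetical helper for illustration; real accelerators may use
    different formats and per-channel scales.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(a, b):
    """Approximate the dot product of a and b using 8-bit integer inputs."""
    qa, scale_a = quantize_int8(a)
    qb, scale_b = quantize_int8(b)
    # The multiply-accumulate runs entirely on small integers, which is
    # where dedicated low-precision hardware saves energy and chip area.
    accumulator = sum(x * y for x, y in zip(qa, qb))
    # A single floating-point rescale recovers the approximate result.
    return accumulator * scale_a * scale_b


# Made-up sample values, not measurements from any IBM system.
weights = [0.12, -0.53, 0.98, -0.07]
inputs = [1.50, 0.25, -0.75, 2.00]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(weights, inputs)

print(f"full-precision result: {exact:.4f}")
print(f"8-bit approximation:   {approx:.4f}")
```

The two printed values differ only slightly, which is the trade that approximate computing makes: a small loss of numerical accuracy in exchange for much cheaper arithmetic.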

GPUs have been widely used for AI workloads because of their ability to run many operations in parallel. Burns and his associates, however, felt that something more was required. The value of specialization had already been demonstrated by GPUs themselves: built for graphics processing, they are less general-purpose than CPUs but still more general than what AI requires. Why not a compute core made just for deep learning, then?

This concept aligned with what IBM Infrastructure customers wanted from IBM Z, including improved security, virtualization capabilities, and AI inferencing. The researchers optimized a chip with those factors in mind and dubbed it an artificial intelligence unit, or AIU. After Burns and his group created their initial functional version, other IBM Infrastructure team members, many of whom worked in software, began contributing to the development of the entire package.

Telum II, mounted on a dual-chip module, builds on the success of the first Telum, the z16 processor with an on-chip AI accelerator. (Image credit: IBM)

Developed to handle the workloads of the future

Timing is a significant barrier in AI chip design, because chips take years to create while workloads change rapidly. That is why, according to Chang, watsonx has been a guiding light amid the ebb and flow of artificial intelligence. A goal of building Spyre to optimize for one particular AI inference benchmark could shift completely in as little as two months. Chang says this has been the most exhilarating experience of his career.

The watsonx team's AI roadmap, which was created years earlier, offered crucial direction. It hypothesized that in 2025 specialized hardware would enable generative AI to scale in novel ways, perhaps going beyond transformers, and it forecast that strong, strategic reasoning models would rise to prominence in 2026. The IBM Infrastructure team has also been very helpful, because it knows its clients' demands well and keeps anticipating what they may require years from now.

The Spyre accelerator is intended to handle the new AI workloads that IBM z17 clients will bring to the platform. For instance, it is optimized for generative and agentic AI rather than for models that the field is cooling on, such as classification models.

In addition, although AI models have largely grown in size over the last ten years, smaller, more functional models are also becoming more popular. At the same time, mixture-of-experts and state space models are becoming more prevalent in the industry, and their full potential and appropriate applications are still being investigated. All of these advancements are reflected in the roadmap that guided the development of Spyre, which has these capabilities built in.

Mainframes are crucial in sectors such as healthcare and finance. The list of AI use cases keeps growing: there are already over 250 for IBM Z, including improved financial fraud detection, money laundering prevention, and credit risk assessment, to mention a few.

Availability

The IBM z17 will be generally available on June 18, 2025. To learn more, go to IBM.com/z17. The IBM Spyre Accelerator is expected to be available beginning in Q4 2025.

Statements on IBM’s future direction and intent are merely goals and objectives and are subject to change or withdrawal at any time.
