Using AMD EPYC with AMD SMT to Increase Performance and Efficiency
A strong emphasis on energy efficiency and performance predictability is one of the key pillars that Arm-based processor vendors frequently cite as a competitive advantage over x86 CPUs. Simultaneous multithreading (SMT), the ability of a core to run multiple threads at once, has been enabled on most enterprise-class CPUs for years and was itself created to deliver performance and efficiency benefits, yet Arm vendors have largely designed it out in their own quest for greater efficiency and performance.
Arm vendors frequently assert that SMT increases implementation cost and energy use, poses security risks, and causes unpredictable performance due to contention for shared resources. Interestingly, Arm’s Neoverse E1-class processor family for embedded applications such as automotive does offer multithreading. In light of these discrepancies, it is worth taking a closer look at SMT.
What is AMD SMT?
Simultaneous Multithreading (SMT) is a technique that enables a CPU core to run several threads at once. Since its introduction, SMT has been incorporated into numerous contemporary CPUs with different thread counts. The most common implementation is 2-way SMT, in which two threads run concurrently on each CPU core instead of each thread running to completion serially, as illustrated below.

Advantages of SMT use:
SMT is a common CPU feature because it provides a number of efficiency and performance advantages:
Improved core resource utilization
SMT dynamically interleaves instructions from two threads across shared execution resources to keep cores occupied. Ideally, a CPU core would execute instructions continuously without interruption, but in practice core stalls are frequent, for example when a branch is mispredicted or when the core waits for data from memory after a cache miss. By letting a second thread use shared core resources while the first is stalled or waiting for data, SMT helps fill these gaps.
Increased throughput
When two threads are executed simultaneously, more instructions can flow through core pipelines in parallel, increasing Instructions Per Cycle (IPC) and improving performance overall.
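The gap-filling effect can be illustrated with a deliberately simple toy model. This is an assumption made for illustration, not a description of how any “Zen” core schedules instructions: suppose a core has a single shared issue slot per cycle, and each thread independently has an instruction ready in a given cycle with probability `p_ready`. The slot then sits idle only when every thread is stalled. The function name and parameters below are hypothetical.

```python
def slot_utilization(p_ready: float, threads: int) -> float:
    """Fraction of cycles in which a single shared issue slot is busy.

    Toy model (assumption): each thread is independently ready with
    probability p_ready, so the slot idles only when every thread stalls.
    """
    if not 0.0 <= p_ready <= 1.0:
        raise ValueError("p_ready must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return 1.0 - (1.0 - p_ready) ** threads


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        one = slot_utilization(p, 1)
        two = slot_utilization(p, 2)
        print(f"p_ready={p:.1f}: 1 thread {one:.2f}, 2 threads {two:.2f}, "
              f"gain {two / one - 1:.0%}")
```

In this model the relative gain from the second thread works out to `1 - p_ready`, so it is largest for stall-heavy workloads and shrinks when a single thread already keeps the core busy. Real cores are much wider and share resources in more complex ways, so the numbers are only indicative of the trend.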
Energy efficiency
AMD SMT can boost performance without appreciably raising the total power consumption of the processor. This results in notable improvements in energy efficiency for a variety of workloads.
SMT also provides the ability to boost capacity or performance without paying more for software that is licensed per physical core.
Software support
In its more than two decades of existence, SMT has been adopted and supported across the software ecosystem, from cloud computing and enterprise applications to video games. All contemporary operating systems are designed to support SMT and do so effectively, allocating threads according to each CPU’s core organisation and NUMA domains. SMT is transparent to high-level software and works out of the box: no effort is required to support it, although software authors have the option to optimise their programs to get more performance and energy efficiency out of it.
Flexibility
SMT can be turned on or off at runtime in Linux, or in the system BIOS for more permanent changes, giving the administrator the freedom to select the configuration that best suits workload requirements.
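On Linux kernels that expose the SMT control interface under sysfs (`/sys/devices/system/cpu/smt/`), the current state can be inspected by reading the `control` and `active` files. The sketch below is one way to do that; the helper name `read_smt_state` and its `base_dir` parameter are hypothetical and exist only for this illustration.

```python
from pathlib import Path

# Standard sysfs location on kernels that provide the SMT control interface.
SMT_SYSFS_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(base_dir: Path = SMT_SYSFS_DIR) -> dict:
    """Return the SMT control setting and whether SMT is currently active.

    Missing files (older kernels, non-Linux systems) are reported as None.
    """
    def read(name: str):
        try:
            return (base_dir / name).read_text().strip()
        except OSError:
            return None

    control = read("control")   # e.g. "on", "off", "forceoff", "notsupported"
    active = read("active")     # "1" if sibling threads are online, else "0"
    return {
        "control": control,
        "active": None if active is None else active == "1",
    }


if __name__ == "__main__":
    print(read_smt_state())
```

With root privileges, writing `off` or `on` to the same `control` file (for example `echo off > /sys/devices/system/cpu/smt/control`) changes the setting at runtime, assuming the platform allows it. A runtime change does not survive a reboot, which is why the BIOS option suits more permanent adjustments.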
Challenges in SMT design:
Although AMD SMT significantly improves a core’s performance, semiconductor and system vendors must also overcome the following hardware design challenges:
Increasing attack surface
Almost every element of a system must be viewed as a potential attack surface, and system and semiconductor vendors devote substantial effort to identifying possible weaknesses throughout the product lifetime. Because SMT shares core resources between two threads, it is a natural target for vulnerabilities such as side-channel attacks. Features that interact with highly privileged system resources are subject to the highest levels of testing and scrutiny.
CPU and system suppliers have recognised and addressed these risks during SMT’s two decades of existence by updating firmware and by changing core designs to eliminate them in later generations. AMD Infinity Guard includes security features such as Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) that help protect against side-channel attacks involving SMT. Furthermore, AMD consistently collaborates with the software community to find and fix any newly discovered potential security flaws in the CPU feature set.
Fair sharing of core resources for both threads
Ensuring a fair distribution of core resources while maintaining high performance for both threads is another challenge. CPU architects must choose which resources will be shared and how to schedule instructions from both threads effectively. The original “Zen” core was designed from the ground up as an SMT-ready core, and subsequent generations of “Zen” build upon the same concepts:
- While the other thread is idle, the running thread has access to all resources.
- When the other thread stalls, each thread can make full use of pipeline resources.
- When SMT is enabled, the majority of the core’s resources are competitively shared between the two threads.

How “expensive” is it to implement SMT?
From the end user’s standpoint, using SMT has no tangible “cost” because it is an integrated feature that the majority of x86 users can freely enable or disable. In the very real world of semiconductor economics, however, everything that consumes silicon area or energy is a cost. Fortunately, the gains that SMT delivers readily offset its minimal implementation cost. For instance, the most recent AMD “Zen 4” and “Zen 5” cores devote less than 5% of their core area to implementing Simultaneous Multithreading (SMT).
This includes all of the logic required to let two threads share the resources of the core. By simple “manager math,” SMT provides up to 384 threads while using less silicon area than 10 physical cores, which is a great return on investment. Furthermore, in situations where software licensing is based on the number of physical cores in the system, the additional performance and capacity made possible by virtual cores and threads can result in significant cost savings. Let’s now address the persistent notion about energy use.
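The “manager math” can be spelled out in a few lines. The sketch below takes the 192-cores-per-socket figure cited elsewhere in this article and treats the “less than 5%” area figure as an upper bound; the constant and function names are hypothetical.

```python
CORES_PER_SOCKET = 192        # top core count cited in this article
THREADS_PER_CORE = 2          # 2-way SMT
SMT_AREA_FRACTION = 0.05      # upper bound: SMT uses less than 5% of core area


def smt_area_in_core_equivalents(cores: int, area_fraction: float) -> float:
    """Total SMT silicon area expressed as a number of whole cores."""
    return cores * area_fraction


if __name__ == "__main__":
    threads = CORES_PER_SOCKET * THREADS_PER_CORE
    area = smt_area_in_core_equivalents(CORES_PER_SOCKET, SMT_AREA_FRACTION)
    print(f"{threads} threads for at most {area:.1f} core-equivalents of area")
```

Because 192 × 0.05 = 9.6, the SMT logic across the whole socket occupies less area than 10 additional cores would, while doubling the hardware thread count from 192 to 384.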
SMT Enables Performance and Efficiency
Hundreds of performance and efficiency world records have been set by AMD EPYC processors. Some of these workloads, including several HPC and technical computing applications, benefit greatly from multithreading and SMT, whereas others do not. Suppose for the moment that you would like a separate, comprehensive evaluation of how AMD delivers the goods and where SMT adds value. Perhaps the most thorough and reliable analysis of SMT’s value has been conducted by the independent testing site Phoronix. As seen in the chart below, the most recent test results for “Zen 5”-based AMD EPYC 9005 CPUs showed significant performance gains across a variety of workloads, including database, encryption, and compression workloads.

These results are not surprising, given that a previous Phoronix analysis of SMT on prior-generation AMD EPYC 9754 systems found comparable performance and power efficiency gains. For readers interested in workloads outside the domains shown here, the Phoronix website offers a detailed analysis of all 170 tests. Although some workloads do appear to prefer exclusive use of all physical core resources, many technical and high-performance computing workloads gain incremental performance when SMT is enabled.
Crucially, across a wide range of workloads on both 4th and 5th generation EPYC CPUs, Phoronix found little to no difference in power consumption between SMT enabled and SMT disabled.
According to Phoronix, AMD EPYC 9005 processors continue to be a clear winner for tasks that can benefit from SMT. The data indicates no overall difference in CPU power consumption between having SMT enabled and disabled across all 170+ tests, which take about 13 hours to complete.
With little or no change in power consumption and substantial SMT performance gains (often between 30 and 50 percent), energy efficiency improves: better performance per watt! SMT, power management, and dynamic frequency scaling are key components that contribute to the energy efficiency of modern x86 superscalar CPUs like AMD EPYC. The advantages are summed up in the following remark:
“On average, AMD SMT enabled on the AMD EPYC 9575F increased CPU power consumption by only 2 Watts compared to when it was disabled.”
Why does efficiency improve? When a core is in the regular operating state (C0) executing instructions, a thread that stalls while waiting for data does not cause the core to drop into a lower power state, so the stalled cycles still consume power. Having a second thread fill those gaps can significantly improve throughput. The higher instruction throughput may cause a modest rise in power consumption, but power efficiency increases significantly.
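The performance-per-watt effect can be estimated with simple arithmetic. In the sketch below, the 30 to 50 percent performance gains and the 2 W average power increase come from the figures quoted above, while the baseline CPU power is a purely hypothetical placeholder (the article does not state one); the function name is hypothetical as well.

```python
def perf_per_watt_ratio(perf_gain: float, base_power_w: float,
                        extra_power_w: float) -> float:
    """Performance per watt with SMT on, relative to SMT off (1.0 = no change)."""
    if base_power_w <= 0:
        raise ValueError("base_power_w must be positive")
    return (1.0 + perf_gain) / (1.0 + extra_power_w / base_power_w)


if __name__ == "__main__":
    BASE_POWER_W = 300.0   # HYPOTHETICAL average CPU power with SMT off
    EXTRA_POWER_W = 2.0    # average increase quoted above
    for gain in (0.30, 0.50):
        ratio = perf_per_watt_ratio(gain, BASE_POWER_W, EXTRA_POWER_W)
        print(f"{gain:.0%} faster -> {ratio - 1:.1%} better performance per watt")
```

Because the added power is tiny relative to any plausible baseline, the efficiency improvement tracks the performance improvement almost one for one, whatever baseline value is assumed.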
AMD EPYC and SMT: Still delivering great value after all these years
Simultaneous multithreading was created at a time when core resources were extremely scarce (one, two, or perhaps four cores per socket), so that users could get as much processing power as possible. Given that AMD EPYC processors now offer up to 192 physical high-performance “Zen 5” cores per socket, one might wonder whether those resources are still so precious and whether SMT is still valuable. Any IT manager struggling to balance budgets against the astounding surge in demand for computing resources is likely to answer with a loud “yes.”
Even as physical cores become more plentiful, they remain valuable, because there is always more work to be done and software license costs are frequently tied to the host server’s physical core count. Making the most of every resource is crucial for the average IT shop, and being able to increase compute capacity and performance incrementally while using the least possible hardware can yield a significant return on investment. SMT is a strong choice because it delivers a comparatively “free” performance improvement when it is needed and can be readily turned off when it is not.