Saturday, July 6, 2024

Leading Chiplet Manufacturers Worldwide

Chiplets Optimise Results by Bringing Data Closer

Lynn Comp, AMD CVP, Server CPU Marketing, spoke at Israel’s annual chip industry event ChipEx2023 this summer. She discussed how AMD is leading the way in improving data centre and cloud performance with chiplets. She also mentioned CXL interconnects and edge computing as ways to address the processor data location imbalance.

Compute, storage, and networking power the digital industry, but the slowest of the three limits overall performance. As the industry has evolved over decades, we have sped up one engine to keep up with the other two, overcorrected in that technology, and then moved on to the next imbalance.

In the mainframe era, the chokepoint was compute. From the 1950s to the early 1980s, computing capacity for complex scientific calculations and accounting was scarce.

Next came client/server computing, which decentralised compute onto PCs. The bottleneck became networking, because it was hard to get data to the PCs.

Later, the Internet and mobile data networks accelerated networking. Storing data where processing tasks could reach it became the issue.

This describes our current world. Mobile devices, edge computing, and the Internet of Things need fast data access, and local storage and network speeds are a major industry challenge.

I suspect another networking issue is next. Data latency from origin to edge is also difficult. Distance and the connections between resources determine latency, whether on a motherboard, in a data centre, or in the cloud.

Some of our largest HPC customers say that every FLOPS of compute performance should be matched by 1 word of data to keep the pipeline fed. HPC systems today typically provide nearly 100 FLOPS per word transferred, they add. The lack of data prevents compute from performing at its best.
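The arithmetic behind that imbalance can be sketched as follows. This is a minimal illustration under a simple bandwidth-bound assumption (a workload that needs one word per FLOP can only compute as fast as words arrive). The function name and the model are hypothetical and are not AMD's or the customers' methodology; only the 1 and roughly 100 FLOPS-per-word figures come from the text above.

```python
def sustained_fraction(flops_per_word_needed: float,
                       flops_per_word_provided: float) -> float:
    """Fraction of peak compute a bandwidth-bound workload can sustain.

    flops_per_word_needed:   FLOPs the workload performs per word it consumes.
    flops_per_word_provided: peak FLOPS the system offers per word it can move.
    Assumption: performance is limited only by data delivery.
    """
    if flops_per_word_needed <= 0 or flops_per_word_provided <= 0:
        raise ValueError("ratios must be positive")
    return min(1.0, flops_per_word_needed / flops_per_word_provided)


balanced = sustained_fraction(1, 1)    # the 1:1 balance the customers ask for
typical = sustained_fraction(1, 100)   # ~100 FLOPS per word, as quoted
print(f"balanced: {balanced:.0%}, typical: {typical:.0%}")
# prints "balanced: 100%, typical: 1%"
```

Under this simplified model, a system offering 100 FLOPS per word could keep a one-word-per-FLOP workload busy only about 1% of the time, which is why feeding the pipeline matters as much as raw compute.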

The problem is extreme in AI, where a prominent tech company driving interactive immersive AR and VR experiences says up to 57% of processing time is spent waiting for network data. GPU resources are too scarce to waste.
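As a rough illustration of what that waiting costs, the sketch below converts a wait fraction into the number of accelerators doing useful work. The fleet size of 1,000 GPUs is a hypothetical figure chosen for the example; only the 57% figure comes from the text.

```python
def busy_fraction(wait_fraction: float) -> float:
    """Share of processing time spent computing rather than waiting on data."""
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    return 1.0 - wait_fraction


def effective_accelerators(installed: int, wait_fraction: float) -> float:
    """Number of accelerators' worth of useful work actually delivered."""
    return installed * busy_fraction(wait_fraction)


# Hypothetical fleet of 1,000 GPUs with the 57% wait time quoted above.
print(f"{effective_accelerators(1000, 0.57):.0f} of 1000 GPUs' worth of work")
# prints "430 of 1000 GPUs' worth of work"
```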

Where data resides and how to move it to processing pipelines is a complex challenge at every computing level: package, motherboard, server, data centre, and network-connected data centres.

Chiplets help solve the problem. Inside the package, several chiplets can replace one expensive monolithic die. Each chiplet can contain multiple processor cores, and chiplets can be added to a package to create higher-performance processors with extreme scalability and flexibility.

AMD pioneered chiplets for high-performance data centre workloads nearly five years ago, but now many data centre processor vendors offer them.

Over the years, AMD has demonstrated the capability and flexibility of chiplets. Our third-generation AMD EPYC processor used this flexibility to add AMD 3D V-Cache technology in 2021, delivering 768MB of L3 cache. We saw super-linear scaling efficiency on workloads such as the F1 racecar 140m test case in ANSYS Fluent 2021 R1, with performance gains beyond what would be expected from the increase in processor core count alone (MLNX-041). This was achieved by moving data closer to the processor cores.
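Super-linear scaling means the measured speedup exceeds the increase in resources, that is, scaling efficiency above 1.0. It can happen when each node's share of the working set shrinks enough to fit in cache. The sketch below shows the calculation; the ratings in it are hypothetical placeholders, not AMD's measured results, and the function name is an assumption for illustration.

```python
def scaling_efficiency(base_rating: float, scaled_rating: float,
                       base_nodes: int, scaled_nodes: int) -> float:
    """Measured speedup divided by ideal (linear) speedup.

    1.0 is perfectly linear scaling; above 1.0 is super-linear.
    """
    if base_rating <= 0 or base_nodes <= 0 or scaled_nodes <= 0:
        raise ValueError("ratings and node counts must be positive")
    speedup = scaled_rating / base_rating
    ideal_speedup = scaled_nodes / base_nodes
    return speedup / ideal_speedup


# Hypothetical ratings: 100 on 1 node, 880 on 8 nodes (made up for the example).
eff = scaling_efficiency(100.0, 880.0, base_nodes=1, scaled_nodes=8)
label = "super-linear" if eff > 1.0 else "linear or sub-linear"
print(f"efficiency {eff:.2f}: {label}")
# prints "efficiency 1.10: super-linear"
```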

Looking ahead, multiple processor manufacturers are working with the Universal Chiplet Interconnect Express (UCIe) consortium, an industry organisation developing standards that let semiconductor solution providers mix and match chiplets from different manufacturers. That should create intriguing scenarios.

Timing and placing data to keep the processing pipeline efficient is a multifaceted issue. Compute Express Link (CXL), an industry standard for high-speed, high-capacity CPU-to-device and CPU-to-memory connectivity, addresses it at the system, node, and rack levels.

CXL lets memory and accelerators be added at the system level over a low-latency interface. CXL scales beyond 3D V-Cache and UCIe. If 3D V-Cache is a small brick of memory next to the CPU, CXL is a larger brick that allows much greater scale.

Edge computing reduces data travel by moving computing resources closer to the network edge, improving compute pipeline efficiency.
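To see why distance matters, here is a minimal sketch of propagation delay alone. It assumes light travels through optical fibre at roughly two-thirds of its speed in vacuum, and it ignores switching, queuing, and processing time. The two distances are hypothetical examples, not figures from the talk.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBRE_VELOCITY_FACTOR = 0.67        # assumption: roughly 2/3 of c in fibre


def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay over fibre, in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    speed_km_s = SPEED_OF_LIGHT_KM_S * FIBRE_VELOCITY_FACTOR
    return 2 * distance_km / speed_km_s * 1000


# Hypothetical distances: a remote cloud region versus a nearby edge site.
for name, km in [("remote data centre", 2000), ("edge site", 20)]:
    print(f"{name}: {round_trip_ms(km):.2f} ms round trip")
```

With these assumed distances, the remote round trip costs about 20 ms before any processing happens, while the edge site costs about 0.2 ms.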

Complex engineering is needed to balance computing resources for pipeline efficiency, but engineering alone can’t solve industry-wide issues. Solutions must also be adopted across the market. Often, the solutions with the best performance and features are not the market leaders in revenue or market share. Why?

Three questions govern technology adoption: Is it elegant? Can it be deployed easily? Is it cheap?

Achieving elegance, ease, and affordability is key for rapid and widespread adoption. My experience in different industries and technology adoption cycles shows that you need two of the three to succeed. Many elegant architectures failed due to cost, deployment, and integration.

Cellular mobile networks are elegant and easy to deploy, but expensive. Wi-Fi is cheap and easy to deploy, but it struggles to match cellular service.

We shouldn’t assume that the best architecture will drive market adoption. Tech companies should aim for all three, but they can succeed with two, to maximise adoption quickly.

MLNX-041: ANSYS FLUENT 2022.1 comparison of Release 19 R1 test case simulation ratings, based on AMD internal testing as of 02/14/2022. The maximum result is LG15. On 8 nodes, comb12, f1-140, race280, comb71, exh33, aw14, and lg15 scale super-linearly. Configurations: 1-, 2-, 4-, and 8-node, 2x 64C AMD EPYC 7773X with AMD 3D V-Cache versus 1-, 2-, 4-, and 8-node, 2x 64C AMD EPYC 7763. The 8-node EPYC 7773X outperforms the 7763 by ~52% on the Boeing Landing Gear 15M (lg15) test case.

Copyright 2023 Advanced Micro Devices, Inc. AMD, the AMD Arrow logo, EPYC, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Ansys, Fluent, and all other Ansys, Inc. brand, product, service, and feature names, logos, and slogans are registered or licensed trademarks in the US or other countries.

PCIe is a trademark of PCI-SIG Corporation. CXL is a trademark of Compute Express Link Consortium Inc. Universal Chiplet Interconnect Express and UCIe are trademarks of the UCIe Consortium. Other product names in this publication are for identification only and may be trademarks of their respective owners.
