Monday, May 20, 2024

Learning Azure’s GPU Future Strategy

Azure’s GPU future strategy is private

They innovate to improve security at Microsoft Azure. Their collaboration with hardware partners to create a silicon-based foundation that protects memory data using confidential computing is a pioneering effort.

Data is created, computed, stored, and moved. Customers already encrypt their data at rest and in transit. They haven’t had the means to protect their data at scale. Confidential computing is the missing third stage in protecting data in hardware-based trusted execution environments (TEEs) that secure data throughout its lifecycle.

Microsoft co-founded the Confidential Computing Consortium (CCC) in September 2019 to protect data Azure’s GPU in use with hardware-based TEEs. These TEEs always protect data by preventing unauthorized access or modification of applications and data during computation. TEEs guarantee data integrity, confidentiality, and code integrity. Attestation and a hardware-based root of trust prove the system’s integrity and prevent administrators, operators, and hackers from accessing it.

For workloads that want extra security in the cloud, confidential computing is a foundational defense in depth capability. Verifiable cloud computing, secure multi-party computation, and data analytics on sensitive data sets can be enabled by confidential computing.

Confidentiality has recently become available for CPUs, but Azure’s GPU based scenarios that require high-performance computing and parallel processing, such as 3D graphics and visualization, scientific simulation and modeling, and AI and machine learning, have also required it. Confidential computing is possible for GPU scenarios processing sensitive data and code in the cloud, including healthcare, finance, government, and education.

Azure has worked with NVIDIA for years to implement GPU confidentiality. This is why previewed Azure confidential VMs with NVIDIA H100-PCIe Tensor Core GPUs at Microsoft Ignite 2023. The growing number of Azure confidential computing (ACC) services and these Virtual Machines will enable more public cloud innovations that use sensitive and restricted data.

GPU confidential computing unlocks use cases with highly restricted datasets and model protection. Scientific simulation and modeling can use confidential computing to run simulations and models on sensitive data like genomic, climate, and nuclear data without exposing the data or code (including model weights) to unauthorized parties. Azure’s GPU This can help scientists collaborate and innovate while protecting data.

Medical image analysis may use confidential computing for image generation. Confidential computing allows healthcare professionals to analyze medical images like X-rays, CT scans, and MRI scans using advanced image processing methods like deep learning without exposing patient data or proprietary algorithms. Keeping data private and secure can improve diagnosis and treatment accuracy and efficiency. Confidential computing can detect medical image tumors, fractures, and anomalies.

Given AI’s massive potential, confidential AI refers to a set of hardware-based technologies that provide cryptographically verifiable protection of data and models throughout their lifecycle, including use. Confidential AI covers AI lifecycle scenarios.

Inference confidentiality. Protects model IP and inferencing requests and responses from model developers, service operations, and cloud providers.

Private multi-party computation. Without sharing models or data, organizations can train and run inferences on models and enforce policies on how outcomes are shared.

Training confidentiality. Model builders can hide model weights and intermediate data like checkpoints and gradient updates exchanged between nodes during training with confidential training. Confidential AI can encrypt data and models to protect sensitive information during AI inference.

Computing components that are private

A robust platform with confidential computing capabilities is needed to meet global data security and privacy demands. It uses innovative hardware and Virtual Machines and containers for core infrastructure service layers. This is essential for services to switch to confidential AI. These building blocks will enable a confidential GPU ecosystem of applications and AI models in the coming years.

Secret Virtual Machines

Confidential Virtual Machines encrypt data in use, keeping sensitive data safe while being processed. Azure was the first major cloud to offer confidential Virtual Machines powered by AMD SEV-SNP CPUs with memory encryption that protects data while processing and meets the Confidential Computing Consortium (CCC) standard.

In the DCe and ECe virtual machines, Intel TDX-powered Confidential Virtual Machines protect data in use. These virtual machines use 4th Gen Intel Xeon Scalable processors to boost performance and enable seamless application onboarding without code changes.

Azure offers confidential virtual machines, which are extended by confidential GPUs. Azure is the sole provider of confidential virtual machines with 4th Gen AMD EPYC processors, SEV-SNP technology, and NVIDIA H100 Tensor Core GPUs in our NCC H100 v5 series.Azure’s GPU Data is protected during processing due to the CPU and GPU’s encrypted and verifiable connection and memory protection mechanisms. This keeps data safe during processing and only visible as cipher text outside CPU and GPU memory.

Containers with secrets

Containers are essential for confidential AI scenarios because they are modular, accelerate development/deployment, and reduce virtualization overhead, making AI/machine learning workloads easier to deploy and manage.

Azure innovated CPU-based confidential containers:

Serverless confidential containers in Azure Container Instances reduce infrastructure management for organizations. Serverless containers manage infrastructure for organizations, lowering the entry barrier for burstable CPU-based AI workloads and protecting data privacy with container group-level isolation and AMD SEV-SNP-encrypted memory.

Azure now offers confidential containers in Azure Kubernetes Service (AKS) to meet customer needs. Organizations can use pod-level isolation and security policies to protect their container workloads and benefit from Kubernetes’ cloud-native standards. Our hardware partners AMD, Intel, and now NVIDIA have invested in the open-source Kata Confidential Containers project, a growing community.

These innovations must eventually be applied to GPU-based confidential AI.

Road ahead

Hardware innovations mature and replace infrastructure over time. They aim to seamlessly integrate confidential computing across Azure, including all virtual machine SKUs and container services. This includes data-in-use protection for confidential GPU workloads in more data and AI services.

Pervasive memory encryption across Azure’s infrastructure will enable organizations to verify cloud data protection throughout the data lifecycle eventually making confidential computing the norm.

agarapuramesh
agarapurameshhttps://govindhtech.com
Agarapu Ramesh was founder of the Govindhtech and Computer Hardware enthusiast. He interested in writing Technews articles. Working as an Editor of Govindhtech for one Year and previously working as a Computer Assembling Technician in G Traders from 2018 in India. His Education Qualification MSc.
RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Recent Posts

Popular Post

Govindhtech.com Would you like to receive notifications on latest updates? No Yes