Wednesday, July 17, 2024

C3 Machine Series and A3 Confidential VMs with NVIDIA H100 GPUs

Artificial intelligence and machine learning workloads are growing in popularity, so it is critical to protect them with dedicated data security measures. Confidential computing can help protect sensitive data used in ML training, preserve the privacy of user prompts and AI/ML models during inference, and enable secure collaboration during model building.

At Google Cloud Next, Google Cloud unveiled two new Confidential Computing offerings designed to protect the privacy and confidentiality of AI/ML workloads: Confidential VMs with NVIDIA H100 Tensor Core GPUs with HGX Protected PCIe, and Confidential VMs with support for Intel AMX.

A3 Confidential VMs: Protecting Confidential AI/ML Workloads

NVIDIA and Google Cloud have partnered to enable Confidential Computing on GPUs. The H100 is NVIDIA’s first GPU to support Confidential Computing, which it does through a dedicated Confidential Computing mode. In this mode, strong hardware-based security helps ensure that only authorised users can access the data and code running inside the Trusted Execution Environment (TEE).

Confidential VMs with NVIDIA H100 Tensor Core GPUs with HGX Protected PCIe unlock use cases involving highly restricted datasets, sensitive models that require extra protection, and collaboration among multiple parties that do not trust one another. Throughout, confidential computing hardware mitigates infrastructure risks and strengthens isolation.

Fine-tuning and training

When training AI models, you may want to make sure that your code and data are protected at all times. Training that involves input data with personally identifiable information (PII), proprietary data labelling, or trade secrets can be carried out in a TEE with A3 Confidential VMs equipped with NVIDIA H100 GPUs. This helps keep any fine-tuning or additional AI/ML training hidden from view outside the TEE.

For example, a retailer may want to build a personalised recommendation engine to improve customer service, which requires training on customer attributes and purchase history. By running the training in a TEE, the retailer can help ensure that customer data is protected throughout.

Inference

AI models and weights are often sensitive intellectual property that requires strong protection. If models are not protected while in use, they risk being manipulated, leaking private customer information, or even being reverse-engineered. By allowing data and models to be processed in a hardened state, A3 Confidential VMs with NVIDIA H100 GPUs can help protect models, inference requests, and responses, even from the model developers if desired. This helps prevent unauthorised access to, or leakage of, the sensitive model and requests.

This is especially important for AI/ML-based chatbots. When interacting with a chatbot built on a natural language processing (NLP) model, users frequently provide private information, and data privacy laws may require these user queries to be protected. By hosting the chatbot on A3 Confidential VMs, the chatbot’s developer can give users additional assurance that their inputs remain private.

Working together

Multiple organisations often need to train models and run inference together while keeping their own models and restricted data private from one another. By using NVIDIA H100 Tensor Core GPUs in Confidential VMs to enforce verifiable policies on data processing and outcome sharing, organisations can collaborate with confidence.

This use case is common in the healthcare sector, where hospitals and medical organisations need to combine highly sensitive medical datasets or records to train models without disclosing each party’s raw data. Using A3 Confidential VMs with NVIDIA H100 GPUs, healthcare organisations can work together in a data clean room, such as Confidential Space, with both security and performance.
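The idea of enforcing verifiable policies before sharing data can be illustrated with a minimal Python sketch. It is not the Confidential Space API: the `Policy` class, the `may_release_key` function, and the claim names `image_digest` and `gpu_cc_mode` are all hypothetical, and the sketch assumes the attestation claims have already been signature-verified elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """What one data owner requires before releasing its dataset key (hypothetical fields)."""
    allowed_image_digests: frozenset
    require_gpu_cc_mode: bool = True


def may_release_key(claims: dict, policy: Policy) -> bool:
    """Return True only if the attested workload satisfies the owner's policy.

    `claims` stands for an attestation report that has ALREADY been
    signature-verified; the key names used here are illustrative only.
    """
    digest_ok = claims.get("image_digest") in policy.allowed_image_digests
    gpu_ok = (not policy.require_gpu_cc_mode) or claims.get("gpu_cc_mode") == "on"
    return digest_ok and gpu_ok


# Two hospitals each define their own policy for the shared training job.
hospital_a = Policy(allowed_image_digests=frozenset({"sha256:aaa"}))
hospital_b = Policy(allowed_image_digests=frozenset({"sha256:aaa", "sha256:bbb"}))

claims = {"image_digest": "sha256:aaa", "gpu_cc_mode": "on"}

# Training proceeds only if every party's policy accepts the attested workload.
all_agree = all(may_release_key(claims, p) for p in (hospital_a, hospital_b))
```

Each party keeps control of its own key, so no single organisation can change the workload without the others’ policies rejecting it.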

A3 Confidential VMs provide end-to-end data protection by extending the Trusted Execution Environment (TEE) to include the integrated NVIDIA H100 GPUs. They do this with a security-first design: the CPU hardware prevents the GPU from directly accessing the Confidential VM’s memory, which maintains a strong security boundary.

Confidential VMs with NVIDIA H100 Tensor Core GPUs
Image credit: Google Cloud

The NVIDIA driver, which runs inside the CPU TEE, uses an encrypted “bounce buffer” in shared system memory to transfer data securely. Acting as an intermediary, this buffer helps mitigate in-band attacks by ensuring that all communication between the CPU and GPU, including command buffers and CUDA kernels, is encrypted. This architecture effectively establishes a secure data pipeline that protects integrity and confidentiality even while sensitive data is processed on the powerful NVIDIA H100 GPUs.

The best part? The GPU firmware and CUDA driver handle encryption transparently, which preserves performance and usability.
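The bounce-buffer flow described above can be pictured with a small, purely conceptual Python sketch. It does not use any NVIDIA or CUDA API, and the real cipher and key exchange are not specified in this article: here a toy keystream built from SHA-256 plus an HMAC tag stands in for the real encryption, and the names `seal`, `open_sealed`, and `session_key` are made up for illustration.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not for production."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate data inside the (simulated) CPU TEE."""
    nonce = os.urandom(12)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify and decrypt on the (simulated) GPU side; raise if the blob was altered."""
    nonce, ciphertext, tag = blob[:12], blob[12:-32], blob[-32:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bounce buffer contents failed the integrity check")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Hypothetical session key shared by the driver (in the CPU TEE) and the GPU.
session_key = os.urandom(32)
payload = b"command buffer + CUDA kernel"

# Only the sealed blob is ever placed in shared (untrusted) system memory.
bounce_buffer = seal(session_key, payload)
received = open_sealed(session_key, bounce_buffer)
```

The point of the sketch is the pattern, not the cryptography: plaintext exists only at the two trusted endpoints, and anything that modifies the staged bytes in between is detected before use. A real design would use a vetted authenticated cipher rather than this toy construction.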

Confidential VMs with Intel TDX and Intel AMX

According to Anand Pashupathy, vice president and general manager of Intel’s Security Software and Solutions Group, “AI will transform almost every industry, but it comes with critical security, privacy, and regulatory requirements.” “AI practitioners can safeguard their data and models and improve their compliance posture with Google Cloud’s new confidential AI offerings protected with Intel TDX, regardless of whether they are utilising CPU-based AI enhanced by Intel AMX instructions or AI accelerated by an external GPU.”

Intel TDX on C3 machine series

The security features available to Confidential VMs depend on the CPU of the machine series. For instance, the general-purpose C3 machine series is powered by 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids). These processors support Intel Trust Domain Extensions (Intel TDX), a Confidential Computing technology.

To prevent unauthorised access to sensitive data and applications, Intel TDX creates a hardware-based trusted execution environment that places each guest virtual machine (VM) in its own cryptographically isolated “trust domain.” Confidential VMs on the C3 machine series with Intel TDX have been available in preview since March.

The C3 CPU also has another important feature: Intel Advanced Matrix Extensions (Intel AMX). This new instruction set architecture (ISA) extension is designed to speed up AI/ML workloads. It adds instructions for two of the most common AI/ML operations: matrix multiplication and convolution. C3 instances equipped with Intel AMX deliver better AI inference performance than instances from earlier generations.
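To make the matrix multiplication workload concrete, here is a minimal Python sketch of a multiply that works through small tiles and accumulates partial results. It is plain Python and does not execute any AMX instruction; the function name `tiled_matmul` and the tile size of 2 are arbitrary choices for illustration.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrix a (m x k) by matrix b (k x n), one tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                # Multiply one tile of `a` by one tile of `b` and
                # accumulate the partial sums into the matching tile of `c`.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
product = tiled_matmul(a, b)
print(product)  # [[58, 64], [139, 154]]
```

The innermost multiply-accumulate over a tile is the kind of step a matrix accelerator performs in hardware, which is why frameworks that target such hardware organise their kernels around tiles.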

To provide a higher level of protection for AI/ML workloads, all Confidential VMs on the C3 machine series have the Intel AMX instruction set enabled by default. As a result, your AI/ML workloads can run confidentially and remain shielded from unauthorised access by cloud operators and privileged administrators. Intel AMX is an integrated accelerator designed to improve the efficiency of CPU-based training and inference, and it is a cost-effective option for tasks such as image recognition, recommendation systems, and natural language processing. Using Intel AMX on Confidential VMs reduces the chance of unauthorised parties discovering AI/ML code or data.

Thales, a global leader in innovative technologies across three business domains (defence and security, aerospace and space, and cybersecurity and digital identity), has benefited from this arrangement by using Confidential Computing to further safeguard its sensitive workloads.

“As more businesses migrate their data and workloads to the cloud, there is an increasing need to protect data integrity and privacy, particularly with regard to intellectual property, AI models, sensitive workloads, and valuable information. Through this partnership, businesses can safeguard and manage their data while it’s in use, in transit, and at rest with completely verifiable attestation. Our strong partnership with Google Cloud and Intel boosts our clients’ confidence in their cloud migration,” stated Todd Moore, vice president of Thales’ data security products.

Give it a shot now

Intel AMX is enabled by default in Confidential VMs with Intel TDX on the C3 machine series. To try out Intel AMX, simply create a Confidential VM on the C3 machine series and run your AI/ML workloads there.
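Once the VM is running, one quick sanity check is to confirm that the guest operating system actually advertises AMX. The Python sketch below is one possible way to do that on a Linux guest. It assumes the flag names `amx_tile`, `amx_bf16`, and `amx_int8` as reported in `/proc/cpuinfo` by recent Linux kernels, which is an assumption and not something stated in this article; the helper names are hypothetical.

```python
from pathlib import Path

# Flag names as assumed to appear in /proc/cpuinfo on recent Linux kernels.
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_amx_flags(cpuinfo_text: str) -> list:
    """Return the AMX flags that are NOT advertised (an empty list means all are present)."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in AMX_FLAGS if flag not in present]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        missing = missing_amx_flags(path.read_text())
        print("AMX available" if not missing else f"Missing AMX flags: {missing}")
    else:
        print("No /proc/cpuinfo found; this check only applies to Linux guests.")
```

Whether a given framework then uses AMX depends on how it was built and configured, so this check only confirms that the instructions are exposed to the guest.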

Since June 2023, Drakshi has been writing articles on Artificial Intelligence for govindhtech. She holds a postgraduate degree in business administration and is an Artificial Intelligence enthusiast.

