Saturday, July 20, 2024

NVIDIA Launches Generative AI Microservices for developers

Nvidia AI Microservices architecture

In order to enable companies to develop and implement unique applications on their own platforms while maintaining complete ownership and control over their intellectual property, NVIDIA released hundreds of enterprise-grade generative AI microservices.

The portfolio of cloud-native microservices, which is built on top of the NVIDIA CUDA platform, includes NVIDIA NIM microservices for efficient inference on over two dozen well-known AI models from NVIDIA and its partner ecosystem. Additionally, NVIDIA CUDA-X microservices for guardrails, data processing, HPC, retrieval-augmented generation (RAG), and other applications are now accessible as NVIDIA accelerated software development kits, libraries, and tools. Additionally, approximately two dozen healthcare NIM and CUDA-X microservices were independently revealed by NVIDIA.

NVIDIA’s full-stack computing platform gains a new dimension with the carefully chosen microservices option. With a standardized method to execute bespoke AI models designed for NVIDIA’s CUDA installed base of hundreds of millions of GPUs spanning clouds, data centers, workstations, and PCs, this layer unites the AI ecosystem of model creators, platform providers, and organizations.

Prominent suppliers of application, data, and cybersecurity platforms, such as Adobe, Cadence, CrowdStrike, Getty Images, SAP, ServiceNow, and Shutterstock, were among the first to use the new NVIDIA generative AI microservices offered in NVIDIA AI Enterprise 5.0.

Jensen Huang, NVIDIA founder and CEO, said corporate systems have a treasure of data that can be turned into generative AI copilots. These containerized AI microservices, created with their partner ecosystem, enable firms in any sector to become AI companies.

Microservices for NIM Inference Accelerate Deployments From Weeks to Minutes

NIM microservices allow developers to cut down on deployment timeframes from weeks to minutes by offering pre-built containers that are driven by NVIDIA inference tools, such as TensorRT-LLM and Triton Inference Server.

For fields like language, voice, and medication discovery, they provide industry-standard APIs that let developers easily create AI apps utilizing their private data, which is safely stored in their own infrastructure. With the flexibility and speed to run generative AI in production on NVIDIA-accelerated computing systems, these applications can expand on demand.

For deploying models from NVIDIA, A121, Adept, Cohere, Getty Images, and Shutterstock as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI, NIM microservices provide the quickest and most efficient production AI container.

Today, ServiceNow revealed that it is using NIM to create and implement new generative AI applications, such as domain-specific copilots, more quickly and affordably.

Consumers will be able to link NIM microservices with well-known AI frameworks like Deepset, LangChain, and LlamaIndex, and access them via Amazon SageMaker, Google Kubernetes Engine, and Microsoft Azure AI.

Guardrails, HPC, Data Processing, RAG, and CUDA-X Microservices

To accelerate production AI development across sectors, CUDA-X microservices provide end-to-end building pieces for data preparation, customisation, and training.

Businesses may utilize CUDA-X microservices, such as NVIDIA Earth-2 for high resolution weather and climate simulations, NVIDIA cuOpt for route optimization, and NVIDIA Riva for configurable speech and translation AI, to speed up the adoption of AI.

NeMo Retriever microservices enable developers to create highly accurate, contextually relevant replies by connecting their AI apps to their business data, which includes text, photos, and visualizations like pie charts, bar graphs, and line plots. Businesses may improve accuracy and insight by providing copilots, chatbots, and generative AI productivity tools with more data thanks to these RAG capabilities.

Nvidia nemo

There will soon be more NVIDIA NeMo microservices available for the creation of bespoke models. These include NVIDIA NeMo Evaluator, which analyzes AI model performance, NVIDIA NeMo Guardrails for LLMs, NVIDIA NeMo Customizer, which builds clean datasets for training and retrieval, and NVIDIA NeMo Evaluator.

Ecosystem Uses Generative AI Microservices To Boost Enterprise Platforms

Leading application suppliers are collaborating with NVIDIA microservices to provide generative AI to businesses, as are data, compute, and infrastructure platform vendors from around the NVIDIA ecosystem.

NVIDIA microservices is collaborating with leading data platform providers including Box, Cloudera, Cohesity, Datastax, Dropbox, and NetApp to assist users in streamlining their RAG pipelines and incorporating their unique data into generative AI applications. NeMo Retriever is a tool that Snowflake uses to collect corporate data in order to create AI applications.

Businesses may use the NVIDIA microservices that come with NVIDIA AI Enterprise 5.0 on any kind of infrastructure, including popular clouds like Google Cloud, Amazon Web Services (AWS), Azure, and Oracle Cloud Infrastructure.

More than 400 NVIDIA-Certified Systems, including as workstations and servers from Cisco, Dell Technologies, HP, Lenovo, and Supermicro, are also capable of supporting NVIDIA microservices. HPE also announced today that their enterprise computing solution for generative AI is now available. NIM and NVIDIA AI Foundation models will be integrated into HPE’s AI software.

VMware Private AI Foundation and other infrastructure software platforms will soon support NVIDIA AI Enterprise microservices. In order to make it easier for businesses to incorporate generative AI capabilities into their applications while maintaining optimal security, compliance, and control capabilities, Red Hat OpenShift supports NVIDIA NIM microservices. With NVIDIA AI Enterprise, Canonical is extending Charmed Kubernetes support for NVIDIA microservices.

Through NVIDIA AI Enterprise, the hundreds of AI and MLOps partners that make up NVIDIA’s ecosystem such as Abridge, Anyscale, Dataiku, DataRobot, Glean,, Securiti AI,, OctoAI, and Weights & Biases are extending support for NVIDIA microservices.

Vector search providers like as Apache Lucene, Datastax, Faiss, Kinetica, Milvus, Redis, and Weaviate are collaborating with NVIDIA NeMo Retriever microservices to provide responsive RAG capabilities for businesses.


NVIDIA microservices are available for free experimentation by developers at Businesses may use NVIDIA AI Enterprise 5.0 on NVIDIA-Certified Systems and top cloud platforms to deploy production-grade NIM microservices.

Since June 2023, Drakshi has been writing articles of Artificial Intelligence for govindhtech. She was a postgraduate in business administration. She was an enthusiast of Artificial Intelligence.

Recent Posts

Popular Post Would you like to receive notifications on latest updates? No Yes