LLMOps should employ acceptable AI tools and processes
LLMOps must address the problems and risks of generative AI as improve it. Data security and privacy, low-quality or ungrounded outputs, misuse and overreliance on AI, hazardous content, and AI systems vulnerable to adversarial attacks like jailbreaks are common problems. Building a generative AI application requires identifying, measuring, mitigating, and monitoring these risks.
Some of the obstacles of constructing generative AI applications are typical software challenges that apply to many applications. Role-based access (RBAC), network isolation and monitoring, data encryption, and application monitoring and logging are security best practices.
Microsoft provides many tools and controls to help IT and development teams solve these predictable concerns. This blog will discuss the probabilistic AI issues of constructing generative AI applications.
First, implementing responsible AI principles like transparency and safety in a production application is difficult. Without pre-built tools and controls, few firms have the research, policy, and engineering resources to operationalize responsible AI. Microsoft takes the greatest cutting-edge research ideas, considers policy and consumer feedback, and produces and integrates practical, responsible AI tools and techniques directly into AI portfolio.
This post covers Azure AI Studio’s model catalog, quick flow, and Azure AI Content Safety. To help developers deploy responsible AI in their businesses, document and share her learnings and best practices.
Mitigation and evaluation mapping to LLMOps livecycle
Generative AI models have risks that must be mitigated by iterative, layered testing and monitoring. Typically, production applications have four technological mitigation layers: LLMOps,model, safety system, metaprompt and grounding, and user experience. Platform layers like the model and safety system include built-in mitigations for many applications.
Application purpose and design determine the next two layers, therefore mitigations might differ greatly. They will compare these mitigation layers to massive language model operations below.
Loop of ideas and exploration: Add model layer and safety system safeguards
One developer explores and evaluates models in a model library to determine whether they meet their use case in the first iteration loop of LLMOps. Responsible AI requires understanding each model’s harm-related capabilities and limitations. Developers can stress-test the model using model cards from the model developer and work data and prompts.
The Azure AI model library includes models from OpenAI, Meta, Hugging Face, Cohere, NVIDIA, and Azure OpenAI Service, classified by collection and task. Model cards describe in depth and allow sample inferences or custom data testing. Some model suppliers fine-tune safety mitigations within their models.
Model cards describe these mitigations and allow sample inferences or custom data testing. Microsoft Ignite 2023 also saw the launch of Azure AI Studio’s model benchmark function, which helps compare library models’ performance.
A safety system
Most applications require more than model-based safety fine-tuning. Big language models can make mistakes and be jailbroken. Azure AI Content Safety, another AI-based safety technology, blocks hazardous content in many Microsoft applications. LLMOps,Customers like South Australia’s Department of Education and Shell show how Azure AI Content Safety protects classroom and chatroom users.
This safety runs your model’s prompt and completion through categorization models that detect and prohibit harmful content across hate, sexual, violent, and self-harm categories and adjustable severity levels (safe, low, medium, and high).
Azure AI Content Safety jailbreak risk and protected material detection public previews were announced at Ignite. Azure AI Content Safety can be used to deploy your model using the Azure AI Studio model catalog or big language model apps to an endpoint.
Expand loop with metaprompt and grounding mitigations
After identifying and evaluating their desired large language model’s essential features, developers go on to guiding and improving it to fit their needs. This is where companies can differentiate their apps.
Metaprompt and foundation
Every generative AI application needs grounding and metaprompt design. Rooting your model in relevant context, or retrieval augmented generation (RAG), can greatly increase model accuracy and relevance. Azure AI Studio lets you easily and securely ground models on structured, unstructured, and real-time data, including Microsoft Fabric data.
Building a metaprompt follows getting the correct data into your application. An AI system follows natural language metaprompts (do this, not that). A metaprompt should let a model use grounded data and enforce rules to prevent dangerous content production or user manipulations like jailbreaks or prompt injections.
They prompt engineering guidance and metaprompt templates are updated with industry and Microsoft research best practices to help you get started. Siemens, Gunnebo, and PwC make custom Azure experiences utilizing generative AI and their own data.
Best-practice mitigations aren’t enough. Testing them before releasing your application in production will ensure they perform properly. Pre-built or custom evaluation processes allow developers to evaluate their apps using performance measures like accuracy and safety metrics like groundedness. A developer can even design and evaluate metaprompt alternatives to see which produces higher-quality results aligned with corporate goals and ethical AI standards.
Operating loop: Add monitoring and UX design safeguards
The third loop depicts development-to-production. This loop focuses on deployment, monitoring, and CI/CD integration. It also demands UX design team collaboration to enable safe and responsible human-AI interactions.
The user experience
This layer focuses on end-user interaction with massive language model applications. Your interface should help consumers comprehend and apply AI technologies while avoiding dangers. The HAX Toolkit and Azure AI documents include best practices for reinforcing user responsibility, highlighting AI’s limitations to avoid overreliance, and ensuring users are using AI appropriately.
Watch your app
Continuous model monitoring is a key LLMOps step to keep AI systems current with changing social behaviors and data. Azure AI provides strong capabilities to monitor production application safety and quality. Build your own metrics or monitor groundedness, relevance, coherence, fluency, and similarity rapidly.
Outlook for Azure AI
Microsoft’s inclusion of responsible AI tools and practices in LLMOps proves that technology innovation and governance are mutually reinforcing.
Azure AI leverages Microsoft’s years of AI policy, research, and engineering knowledge to help your teams build safe, secure, and reliable AI solutions from the start and apply corporate controls for data privacy, compliance, and security on AI-scale infrastructure. Look forward to developing for her customers to assist every organization experience the short- and long-term benefits of trust-based applications.