Use mathematically sound Automated Reasoning checks to avoid factual errors from LLM hallucinations (preview)
Today, AWS is introducing Automated Reasoning checks (preview), a new safeguard in Amazon Bedrock Guardrails that helps you mathematically validate the accuracy of responses generated by large language models (LLMs) and prevent factual errors from hallucinations.
Amazon Bedrock Guardrails helps you protect generative AI applications by filtering undesirable content, redacting personally identifiable information (PII), and enhancing content safety and privacy. You can configure policies for content filters, denied topics, word filters, PII redaction, contextual grounding checks, and now Automated Reasoning checks.
Automated Reasoning checks help prevent factual errors from hallucinations by using sound mathematical, logic-based algorithmic verification and reasoning processes to verify that the information generated by a model is consistent with known facts and not based on fabricated or inconsistent data.
Amazon Bedrock Guardrails is the only responsible AI capability offered by a major cloud provider that enables customers to build and customize safety, privacy, and truthfulness for their generative AI applications within a single solution.
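To give a rough sense of how the policy types listed above come together in a single guardrail, the following sketch uses the boto3 CreateGuardrail API. The guardrail name, filter choices, thresholds, and messages are illustrative placeholders, not recommendations, and Automated Reasoning checks themselves are configured separately, as shown later in this post.

```python
import boto3

# Control-plane client for creating and managing guardrails
bedrock = boto3.client("bedrock", region_name="us-west-2")

# A minimal guardrail combining several policy types; all values are placeholders.
response = bedrock.create_guardrail(
    name="example-guardrail",
    description="Example guardrail combining multiple policy types",
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        ]
    },
    topicPolicyConfig={
        "topicsConfig": [
            {
                "name": "Investment advice",
                "definition": "Recommendations about specific financial investments.",
                "type": "DENY",
            }
        ]
    },
    wordPolicyConfig={"managedWordListsConfig": [{"type": "PROFANITY"}]},
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [{"type": "EMAIL", "action": "ANONYMIZE"}]
    },
    contextualGroundingPolicyConfig={
        "filtersConfig": [
            {"type": "GROUNDING", "threshold": 0.75},
            {"type": "RELEVANCE", "threshold": 0.75},
        ]
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)

print(response["guardrailId"], response["version"])
```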

An overview of Automated Reasoning checks
Automated reasoning is a field of computer science that uses logical deduction and mathematical proofs to verify how programs and systems behave. Unlike machine learning (ML), which makes predictions, automated reasoning provides mathematical guarantees about a system's behavior. Amazon Web Services (AWS) already applies automated reasoning in key service areas such as storage, networking, virtualization, identity, and cryptography. For example, automated reasoning is used to formally verify that cryptographic implementations are correct, which improves both performance and development speed.
Now, AWS is applying a similar approach to generative AI. The new Automated Reasoning checks (preview) in Amazon Bedrock Guardrails are the first and only generative AI safeguard that helps prevent factual errors from hallucinations using logically accurate and verifiable reasoning that explains why generative AI responses are correct. These checks are particularly useful for use cases where factual accuracy and explainability are important. For example, you could use Automated Reasoning checks to validate LLM-generated responses about human resources (HR) policies, company product information, or operational workflows.
Used alongside other techniques such as prompt engineering, retrieval-augmented generation (RAG), and contextual grounding checks, Automated Reasoning checks add a more rigorous and verifiable approach to ensuring that LLM-generated output is factually accurate. By encoding your domain knowledge into structured policies, you can have confidence that your conversational AI applications are providing accurate and reliable information to your users.
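For example, if you already ground responses with RAG, you can pass the retrieved context, the user's question, and the model's answer to the ApplyGuardrail API so that contextual grounding checks (and any other policies configured on the guardrail) evaluate the output. In the sketch below, the guardrail ID, version, region, and text values are placeholders.

```python
import boto3

# Runtime client for evaluating content against an existing guardrail
runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

# Placeholder values: use your own guardrail ID/version and application content.
retrieved_context = "Employees accrue 1.5 vacation days per month of service."
user_question = "How many vacation days do I earn each month?"
model_answer = "You earn 1.5 vacation days for every month you work."

response = runtime.apply_guardrail(
    guardrailIdentifier="your-guardrail-id",
    guardrailVersion="1",
    source="OUTPUT",  # evaluate a model response rather than user input
    content=[
        {"text": {"text": retrieved_context, "qualifiers": ["grounding_source"]}},
        {"text": {"text": user_question, "qualifiers": ["query"]}},
        {"text": {"text": model_answer, "qualifiers": ["guard_content"]}},
    ],
)

# "action" is GUARDRAIL_INTERVENED when a policy blocks or masks content;
# "assessments" contains the per-policy evaluation details.
print(response["action"])
print(response["assessments"])
```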
Using Amazon Bedrock Guardrails’ Automated Reasoning checks (preview)
With Automated Reasoning checks in Amazon Bedrock Guardrails, you can create Automated Reasoning policies that encode your organization's rules, procedures, and guidelines into a structured, mathematical format. These policies can then be used to verify that the content generated by your LLM-powered applications is consistent with your guidelines.
Automated Reasoning policies are composed of a set of variables, each defined with a name, type, and description, and the logical rules that operate on those variables. Behind the scenes, rules are expressed in formal logic, but they're translated into natural language so that users without formal logic expertise can refine a model more easily. Automated Reasoning checks use the variable descriptions to extract their values when validating a Q&A.
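Conceptually, you can think of a policy as a set of typed variables plus rules that the service compiles into formal logic. The sketch below is only an illustration of that structure, using hypothetical HR-policy variables and rules expressed as plain data; it is not the actual policy schema used by Amazon Bedrock.

```python
# Illustrative only: a conceptual view of an Automated Reasoning policy,
# not the actual schema used by Amazon Bedrock Guardrails.
policy = {
    "variables": [
        {
            "name": "weekly_hours",
            "type": "integer",
            "description": "Number of hours the employee works per week.",
        },
        {
            "name": "is_full_time",
            "type": "boolean",
            "description": "Works more than 20 hours per week. "
                           "True for full-time, false for part-time.",
        },
        {
            "name": "vacation_days_per_month",
            "type": "real",
            "description": "Vacation days the employee accrues each month.",
        },
    ],
    "rules": [
        # Stored as formal logic behind the scenes, surfaced in natural language.
        "is_full_time is true if and only if weekly_hours is greater than 20",
        "if is_full_time is true then vacation_days_per_month is 1.5",
        "if is_full_time is false then vacation_days_per_month is 0.75",
    ],
}
```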
How it works
Create an Automated Reasoning policy
Using the Amazon Bedrock console, you can upload documents that describe your organization's rules and procedures. Amazon Bedrock analyzes these documents and automatically creates an initial Automated Reasoning policy, which represents the key concepts and their relationships in a mathematical format.
In the console, go to Safeguards and choose the new Automated Reasoning menu item. Create a new policy and give it a name. Upload an existing document that defines the right solution space, such as an HR guideline or an operational manual.
Next, define the policy's intent and any processing parameters. For example, specify whether it will validate questions from airport employees, and identify elements that shouldn't be processed, such as internal reference numbers. Also include one or more sample Q&As to help the system understand typical interactions.
Next, select Create.
The system now automatically creates your Automated Reasoning policy. This process involves analyzing your document, identifying its key concepts, breaking it down into individual units, translating these natural language units into formal logic, validating the translations, and finally combining them into a comprehensive logical model. Once complete, review the generated structure, including the rules and variables. You can edit these for accuracy through the user interface.
To test the Automated Reasoning policy, you first need to create a guardrail.
Create a guardrail with Automated Reasoning checks
When building your conversational AI application with Amazon Bedrock Guardrails, you can enable Automated Reasoning checks and choose which Automated Reasoning policies to use for validation.
Choose the Guardrails menu item under Safeguards. Create a new guardrail and give it a name. Then, choose Enable Automated Reasoning and select the policy and policy version you want to use. Finish configuring your guardrail.
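During the preview, this configuration is done in the console. If and when it is also exposed through the CreateGuardrail API, attaching a policy might look roughly like the minimal sketch below. The automatedReasoningPolicyConfig field name and its shape are assumptions for illustration only, not documented parameters, and the policy ARN is a placeholder.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

# ASSUMPTION: the automatedReasoningPolicyConfig field and its structure are
# illustrative guesses; in the preview, Automated Reasoning policies are
# attached to a guardrail in the console.
response = bedrock.create_guardrail(
    name="ar-enabled-guardrail",
    automatedReasoningPolicyConfig={  # hypothetical parameter
        "policies": [
            "arn:aws:bedrock:us-west-2:111122223333:automated-reasoning-policy/EXAMPLE"
        ],
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)
```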
Test Automated Reasoning checks
You can use the Test playground in the Automated Reasoning console to verify that your Automated Reasoning policy works as expected. Enter a test question the way a user would interact with your application, along with a sample answer to validate.
Automated Reasoning checks analyze the content and validate it against the Automated Reasoning policies you've configured. The checks identify any factual inconsistencies or errors and provide an explanation of the validation findings.
If the validation result is invalid, the suggestions show a list of variable assignments that would make the conclusion valid.
If no factual inconsistencies are found and the validation result is valid, the suggestions show a list of variable assignments that are necessary for the result to hold; these are unstated assumptions in the answer.
If factual inconsistencies are found, the console shows Mixed results as the validation result, and the API response contains a list of findings, some marked as valid and others as invalid. In that case, review the system's findings and suggestions and refine any ambiguous policy rules.
You can also use the validation results as feedback to improve LLM-generated responses.
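One way to act on that feedback is to send the explanations and suggested variable assignments back to the model and ask it to revise its answer. The sketch below assumes you have already extracted the findings into plain-text strings (the finding content shown is hypothetical, since the preview response shape isn't documented here) and uses the Bedrock Converse API with a placeholder model ID.

```python
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

# HYPOTHETICAL example: assume invalid findings from the Automated Reasoning
# check were extracted into plain-text explanations and suggestions.
invalid_findings = [
    "Invalid: the answer states 2 vacation days per month, but the policy "
    "implies 1.5 days per month for full-time employees.",
]

original_answer = "Full-time employees earn 2 vacation days per month."

# Ask the model to revise its answer using the validation feedback.
correction_prompt = (
    "Revise the following answer so that it is consistent with the policy feedback.\n\n"
    "Answer:\n" + original_answer + "\n\n"
    "Feedback:\n" + "\n".join(invalid_findings)
)

response = runtime.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": correction_prompt}]}],
)

print(response["output"]["message"]["content"][0]["text"])
```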
Achieving high validation accuracy is an iterative process. As a best practice, review your policy's performance regularly and make adjustments as needed. You can edit the rules in natural language, and the system automatically updates the logical model.
For example, refining variable descriptions can significantly improve validation accuracy. Consider a scenario where the description of the is_full_time variable simply says, "works more than 20 hours per week," and a question states, "I'm a full-time employee." In this case, Automated Reasoning checks might not recognize the term "full-time."
To improve accuracy, you should update the variable description to be more comprehensive, for example: "Works more than 20 hours per week. Users may refer to this as full-time or part-time. The value should be true for full-time and false for part-time." This comprehensive description helps the system identify all relevant factual claims in natural language questions and answers, leading to more accurate validation results.
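To make the refinement concrete, here is the same hypothetical is_full_time variable from the earlier conceptual sketch, shown before and after improving its description. As before, the structure is illustrative only, not the service's actual schema.

```python
# Illustrative only: the same hypothetical variable before and after refining
# its description to cover the terms users actually use.
is_full_time_before = {
    "name": "is_full_time",
    "type": "boolean",
    "description": "Works more than 20 hours per week.",
}

is_full_time_after = {
    "name": "is_full_time",
    "type": "boolean",
    "description": (
        "Works more than 20 hours per week. Users may refer to this as "
        "full-time or part-time. The value should be true for full-time "
        "and false for part-time."
    ),
}
```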
Available in preview
The new Automated Reasoning checks safeguard is available today in preview in Amazon Bedrock Guardrails in the US West (Oregon) AWS Region. To request access to the preview, contact your AWS account team. In the coming weeks, look for a sign-up form in the Amazon Bedrock console.