Guardrails in Agumbe AI Gateway help teams control how AI traffic behaves before it reaches a model and after a model returns a response. They are designed to give customers a practical, app-level policy system for real production workloads. Instead of pushing safety, usage, and model controls into every individual application, Agumbe lets you define those controls once and enforce them consistently through the gateway. For most teams, guardrails are one of the main reasons to use a gateway in the first place. They turn AI access from a raw model integration into a governed platform capability.
What guardrails do
A guardrail policy tells the gateway how to inspect, modify, or block requests and responses for a specific app. Depending on how a policy is configured, the gateway can:
- detect risky or disallowed content
- redact sensitive content before it reaches a model
- block a request entirely
- inspect model output before returning it
- restrict which models an app is allowed to use
- cap token usage
- apply app-level request rate limits
Why guardrails matter
AI applications often need more than authentication and model access. They also need rules. For example, a team may want to:
- prevent prompt injection attempts from passing through unchanged
- redact personally identifiable information before it reaches a model
- block secrets or credentials from being exposed in prompts or outputs
- restrict an app to a small set of approved models
- keep responses grounded in supplied context
- limit output size and request rate for a specific workload
Guardrails are app-level
In Agumbe, guardrails are stored and enforced at the app level. That means each app can have its own policy based on its purpose, sensitivity, risk level, and operational needs. This is important because not all AI workloads are the same. A support workflow, an internal knowledge assistant, and a marketing content tool usually should not share the exact same rules. Agumbe lets you keep those policies separate while still using one common gateway.
How a guardrail policy is selected
The gateway determines which guardrail policy to apply based on the app context of the request. There are two common patterns.
App-scoped API key
If the request is made with an app-scoped API key, the gateway automatically uses that app’s guardrail policy. This is the simplest and safest production pattern when one workload should always use one app policy.
Tenant-scoped API key
If the request is made with a tenant-scoped API key, the caller can select the app policy for that request by sending agumbe_guardrails_app_id.
Example:
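A minimal sketch of such a request body, assuming the gateway accepts OpenAI-compatible chat completion payloads; the model name and app ID are illustrative placeholders, and the tenant-scoped API key is sent through the gateway's normal authentication mechanism:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Summarize this support ticket." }
  ],
  "agumbe_guardrails_app_id": "app_support_assistant"
}
```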
When guardrails are applied
Guardrails are applied during request execution in three stages.
1. Request stage
At the request stage, the gateway can enforce controls such as:
- allowed model restrictions
- token caps
- app-level rate limits
2. Input stage
At the input stage, the gateway can inspect prompt content or embeddings input and decide whether to detect, redact, or block content based on the policy. This is where controls such as prompt injection detection, PII handling, secrets handling, and denied topic checks are applied to incoming data.
3. Output stage
At the output stage, the gateway can inspect the model response before returning it to the caller. This is where controls such as output safety, PII checks, secrets checks, denied topics, and groundedness evaluation are applied to generated text.
Supported guardrail controls
Agumbe currently supports the following guardrail policy fields.
Prompt injection
Use this to inspect request content for direct prompt injection attempts. This helps identify or block content intended to override instructions, reveal hidden prompts, bypass safety checks, or manipulate model behavior. Typical use cases:
- customer-facing assistants
- internal copilots
- retrieval-augmented generation pipelines
- agents that read user-provided text
Indirect prompt injection
Use this to inspect content that may contain embedded or retrieved instructions designed to hijack model behavior. This is especially relevant when your system works with documents, knowledge bases, retrieved content, or long-form user inputs. Typical use cases:
- document Q&A
- retrieval systems
- knowledge assistants
- agentic workflows
PII
Use this to detect, redact, or block personally identifiable information in prompts and outputs. This helps reduce the chance of sending sensitive user data to models unnecessarily and helps prevent sensitive output from being returned to callers. Typical examples include:
- email addresses
- phone numbers
- payment card numbers
- other personal identifiers
Secrets
Use this to detect, redact, or block credentials and secret material. This is useful for preventing accidental leakage of tokens, keys, private credentials, or secret-bearing payloads. Typical examples include:
- API keys
- private keys
- access tokens
- JWTs
- cloud credentials
Denied topics
Use this to detect or block requests and outputs that relate to topics your app should not handle. This gives teams a simple way to define domain-level exclusions for an app. Examples might include:
- legal advice
- medical diagnosis
- self-harm content
- prohibited operational topics
- restricted internal content categories
Output safety
Use this to inspect generated output for unsafe or disallowed content patterns. This helps prevent the gateway from returning harmful generated text to the caller. Typical examples include:
- instructions for harmful activity
- credential theft guidance
- unsafe exploit-oriented output
- phishing or malicious operational guidance
Groundedness
Use this to check whether a generated answer stays anchored to context supplied by the caller. This is especially useful for retrieval-augmented systems where the response should remain tied to known source material. The gateway can use agumbe_grounding_context from the request to evaluate whether the generated response stays consistent with the provided grounding context.
Allowed models
Use this to restrict which models an app is allowed to use. This is useful when a team wants to:
- standardize approved models
- limit usage to tested models only
- avoid high-cost or unapproved models
- separate development and production model access
Max tokens
Use this to cap token usage for a specific app. This helps teams control output size and reduce cost or misuse risk. If a request asks for more tokens than the app policy allows, the gateway caps the request to the configured maximum.
Rate limit per minute
Use this to define an app-level request rate limit. This gives teams a simple way to constrain how much traffic a workload can send in a short window.
Guardrail modes
Most content-oriented guardrails support four policy modes:
- off
- detect
- redact
- block
Off
The gateway does not apply that control. Use this when the guardrail is not relevant to the app.
Detect
The gateway records that a policy match occurred but allows the content to continue unchanged. Use this when you want visibility first, before you start enforcing stronger actions. This is a good starting point for new teams or new workloads.
Redact
The gateway replaces matched content before continuing. Use this when you want to allow the request or response to proceed, but you do not want specific sensitive content to pass through unchanged. This is often a strong default for PII and secrets.
Block
The gateway rejects the request or response when a match occurs. Use this when the content should never be allowed for the app. This is appropriate for high-risk workflows or clearly disallowed content classes.
Example guardrail policy
Here is a representative policy payload.
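This is a sketch assuming a JSON policy document; the field names mirror the controls described above, while the mode values, model names, and numeric limits are illustrative:

```json
{
  "promptInjection": "block",
  "indirectPromptInjection": "detect",
  "pii": "redact",
  "secrets": "block",
  "deniedTopics": "block",
  "outputSafety": "block",
  "groundedness": "detect",
  "allowedModels": ["gpt-4o-mini"],
  "maxTokens": 2048,
  "rateLimitPerMinute": 120
}
```

Example: request with app selection

As in the tenant-scoped pattern above, a caller selects the app policy by including agumbe_guardrails_app_id in the request body; the surrounding payload shape and identifiers are illustrative:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Draft a reply to this customer." }
  ],
  "agumbe_guardrails_app_id": "app_support_assistant"
}
```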
Example: request with grounding context
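A sketch of a request that supplies grounding material for the groundedness check; the agumbe_grounding_context field name comes from this page, but treating it as a single string (rather than a structured object) is an assumption, and the content is illustrative:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "What does the refund policy say about digital purchases?" }
  ],
  "agumbe_grounding_context": "Refunds for digital purchases are available within 14 days if the item has not been downloaded."
}
```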
Guardrail traces in responses
When a guardrail policy is applied, the gateway can attach an agumbe_guardrails object to the response.
This gives the caller structured visibility into what happened during policy enforcement.
A trace can include:
- whether a guardrail policy was applied
- whether the subject was a session or app credential
- which app policy was used
- which decisions were made
- whether content was detected, redacted, blocked, capped, or rate-limited
Example response fragment
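A sketch of what such a fragment could look like; only the agumbe_guardrails object name comes from this page, and the inner field names and values are illustrative of the trace contents listed above:

```json
{
  "agumbe_guardrails": {
    "applied": true,
    "subject": "app",
    "app_id": "app_support_assistant",
    "decisions": {
      "pii": "redacted",
      "promptInjection": "detected",
      "maxTokens": "capped"
    }
  }
}
```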
How guardrails affect production behavior
Guardrails shape both safety and operational behavior. That means they should be treated as a product and platform configuration concern, not only a model concern. For example, the same model may behave very differently depending on:
- whether prompt injection is detected or blocked
- whether secrets are redacted
- whether the app is limited to approved models
- whether groundedness is enforced
- whether max tokens are capped aggressively
- whether request rate limits are strict or permissive
Recommended rollout strategy
For most teams, the best way to adopt guardrails is gradually.
Start with visibility
Begin with detect for controls such as:
- prompt injection
- indirect prompt injection
- denied topics
- output safety
- groundedness
Redact sensitive data early
For pii and secrets, redact is often a strong practical default. This reduces risk while still allowing useful application traffic to continue.
Tighten model controls
Use allowedModels, maxTokens, and rateLimitPerMinute early in production. These controls are often low-friction and high-value.
Move to blocking where appropriate
Once you understand real traffic patterns, move high-risk controls to block where needed. This is especially helpful for regulated, customer-facing, or high-sensitivity workloads.
Best practices
Keep policies app-specific
Do not try to make one policy fit every workload. Different apps have different risk profiles and operational needs.
Use app-scoped keys when policies should never vary
If a workload should always use one guardrail policy, app-scoped API keys reduce ambiguity and improve safety.
Start simple
You do not need a very large policy on day one. A focused policy with PII, secrets, allowed models, token caps, and basic prompt injection detection is often enough to start well.
Review traces and logs
Use request logs and guardrail traces to understand how policies are behaving in real traffic.
Pair guardrails with model strategy
Guardrails are strongest when combined with sensible aliases, approved model lists, and production environment separation.
Common errors
Guardrail enforcement can produce structured errors when the gateway blocks a request. Examples include:
- guardrail_blocked
- guardrail_model_blocked
- guardrail_rate_limit_exceeded
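A sketch of what a blocked request's error body might look like; the error codes above come from this page, while the envelope shape and message text are illustrative assumptions:

```json
{
  "error": {
    "code": "guardrail_blocked",
    "message": "Request blocked by the guardrail policy for this app."
  }
}
```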
Recommended starting point
If you are setting up guardrails for the first time, start with this approach (a policy sketch follows this list):
- promptInjection: detect
- indirectPromptInjection: detect
- pii: redact
- secrets: redact
- deniedTopics: detect
- outputSafety: detect
- groundedness: detect
- a short allowedModels list
- a sensible maxTokens cap
- a reasonable rateLimitPerMinute
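Expressed as a policy payload, using the same assumed JSON shape as the representative example above; the model name and numeric limits are placeholders:

```json
{
  "promptInjection": "detect",
  "indirectPromptInjection": "detect",
  "pii": "redact",
  "secrets": "redact",
  "deniedTopics": "detect",
  "outputSafety": "detect",
  "groundedness": "detect",
  "allowedModels": ["gpt-4o-mini"],
  "maxTokens": 1024,
  "rateLimitPerMinute": 60
}
```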
Next steps
Once guardrails are in place, the next pages to read are:
- Routing and Reliability to understand how requests are executed after policy checks
- Request Logging and Observability to see how policy decisions and request metadata appear operationally
- Go to Production for a broader production-readiness checklist