Documentation Index
Fetch the complete documentation index at: https://agumbe.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Agumbe AI Gateway is the infrastructure layer that helps teams build, run, and govern AI-powered applications from one place while reducing the risk of AI-related data breaches. It gives your application a single gateway for model access, policy enforcement, usage tracking, and operational visibility, while the Agumbe Console gives your team a control plane to manage how that traffic behaves.
At a high level, Agumbe separates the platform into two parts: the data plane and the control plane.
The data plane is the live gateway path. This is where your application sends chat and embeddings requests. The gateway authenticates the request, resolves the selected model or alias, applies the right app-level guardrails, routes the request to the configured provider adapter, normalizes the response, records usage, and emits request metadata for observability and billing.
The control plane is the Agumbe Console and supporting platform services. This is where your team creates API keys, tests prompts in the playground, configures guardrails, reviews request activity, tracks wallet health, and manages operational settings.
Together, these layers let teams move from direct model-provider calls to a managed AI platform without making application code more complicated.
A typical request flows through Agumbe in five steps.
- Your application sends a request
Your backend calls Agumbe AI Gateway using a Gateway API key. The request can be a chat completion or an embeddings request.
- The gateway authenticates the caller
The gateway validates the API key or session and identifies the tenant, user, app, and key scope. This allows Agumbe to apply the right access rules and tenant-level controls.
- The gateway resolves the model
Your request can use a stable Agumbe alias, such as smart-default or reasoning, or a catalog-backed model ID. The gateway resolves that name to the right provider and upstream model.
- The gateway applies policy
Agumbe loads the guardrail policy for the selected app. Depending on your configuration, the gateway can detect, redact, block, or enforce rules for prompt injection, PII, secrets, denied topics, output safety, groundedness, model allowlists, max tokens, and rate limits.
- The gateway routes, logs, and returns the response
The gateway sends the request to the selected provider adapter, normalizes the response, records latency and usage metadata, estimates cost when pricing is configured, and returns the response to your application.
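The steps above can be sketched from the application's side. The endpoint URL, payload fields, and environment variable name below are illustrative assumptions, not the documented API; check the API reference for the real schema.

```python
import json
import os
import urllib.request

# Hypothetical gateway endpoint -- replace with the real data-plane URL.
GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"

def build_chat_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Build an authenticated chat-completion request for the gateway."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The Gateway API key identifies the tenant, user, app, and key scope.
        "Authorization": f"Bearer {os.environ.get('AGUMBE_API_KEY', 'demo-key')}",
    }
    return urllib.request.Request(GATEWAY_URL, data=body, headers=headers, method="POST")

req = build_chat_request(
    model="smart-default",  # a stable alias; the gateway resolves the upstream model
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
)
# Sending the request (urllib.request.urlopen(req)) would then return the
# normalized response after authentication, policy checks, and routing.
```

From the application's point of view this is the entire integration surface: one authenticated call, with everything from step 2 through step 5 handled inside the gateway.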
Core components
Agumbe AI Gateway
The gateway is the runtime service for LLM traffic. It exposes APIs for chat completions, embeddings, models, guardrails, and request logs. It is responsible for authentication, model resolution, routing, guardrail enforcement, request logging, usage emission, and reliability controls.
Agumbe Console
The console is the control plane for developers and operators. Teams use it to create and manage credentials, test requests in the playground, configure app-level guardrails, inspect recent requests, and track wallet or billing status.
Authentication
Agumbe supports authenticated access through Gateway API keys and console sessions. For production applications, use Gateway API keys from your backend or service layer. API keys can be tenant-scoped or app-scoped, depending on how you want guardrail policies to be selected.
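A minimal backend-side sketch of key handling, assuming an environment variable named `AGUMBE_API_KEY` (the variable name is an assumption for this example). The point is that Gateway API keys stay in your backend or service layer and never reach client code.

```python
import os

def gateway_headers() -> dict:
    """Return auth headers for backend calls to Agumbe; fail fast if the key is missing."""
    key = os.environ.get("AGUMBE_API_KEY")
    if not key:
        # Keys are created and managed in the Agumbe Console.
        raise RuntimeError("AGUMBE_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```

Whether the key is tenant-scoped or app-scoped does not change this call pattern; it changes which guardrail policy the gateway selects for the request.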
Model catalog and aliases
The model catalog lists the models and aliases available through the gateway. Aliases give teams stable names for application code, while the gateway keeps the actual routing decision behind the platform. This makes it easier to change providers, models, fallbacks, or routing rules without rewriting every application.
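Conceptually, alias resolution is a lookup the platform owns, so application code only ever sees the stable name. The table below is a toy sketch with invented provider and model names; the real catalog is served by the gateway and can change without touching your code.

```python
# Invented alias table for illustration -- the real one lives behind the gateway.
ALIAS_TABLE = {
    "smart-default": ("provider-a", "model-x"),
    "reasoning": ("provider-b", "model-y"),
}

def resolve_model(name: str) -> tuple[str, str]:
    """Resolve an alias, or pass through a catalog-backed 'provider/model' ID."""
    if name in ALIAS_TABLE:
        return ALIAS_TABLE[name]
    if "/" in name:  # assumed catalog-ID shape for this sketch
        provider, model = name.split("/", 1)
        return provider, model
    raise ValueError(f"unknown model or alias: {name}")
```

Swapping `smart-default` from one provider to another is then a one-row change in the platform, not a code change in every application.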
Guardrails
Guardrails are app-level policies that help teams control how AI traffic behaves. Policies can apply to both request and response traffic. Teams can use guardrails to manage prompt injection risk, sensitive data, unsafe output, denied topics, groundedness, allowed models, token limits, and rate limits.
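To make the enforcement model concrete, here is a small sketch of two of the simpler policy types, a model allowlist and a token ceiling. The field names are assumptions for illustration, not the real policy schema, and real guardrails (prompt injection, PII, groundedness) involve content analysis rather than simple comparisons.

```python
from dataclasses import dataclass, field

@dataclass
class GuardrailPolicy:
    """Illustrative app-level policy; field names are invented for this sketch."""
    allowed_models: set[str] = field(default_factory=set)
    max_tokens: int = 4096

def check_request(policy: GuardrailPolicy, model: str, max_tokens: int) -> list[str]:
    """Return a list of violations; an empty list means the request may proceed."""
    violations = []
    if policy.allowed_models and model not in policy.allowed_models:
        violations.append(f"model '{model}' is not on the allowlist")
    if max_tokens > policy.max_tokens:
        violations.append(f"max_tokens {max_tokens} exceeds policy limit {policy.max_tokens}")
    return violations
```

Depending on configuration, a violation might block the request outright, redact content, or simply be recorded for review.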
Request logs and observability
Agumbe records request metadata such as request ID, tenant, app, request type, model, provider, status, latency, token usage, estimated cost, and error code. This gives teams a practical view of how AI traffic is behaving in production.
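The metadata fields listed above can be pictured as a single record per request. The field names and sample values here are illustrative, not the actual log schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class RequestLog:
    """One gateway request's metadata; names mirror the list above, illustratively."""
    request_id: str
    tenant: str
    app: str
    request_type: str        # e.g. "chat" or "embeddings"
    model: str
    provider: str
    status: int
    latency_ms: float
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: Optional[float] = None  # only when pricing is configured
    error_code: Optional[str] = None

log = RequestLog("req-123", "acme", "support-bot", "chat",
                 "smart-default", "provider-a", 200, 412.5, 350, 120)
```

A stream of records like this is what powers the request-activity views in the console.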
Usage and billing
The gateway emits usage events and records token and request metadata. Supporting tenant and payment services power wallet and billing visibility in the console. This helps teams understand consumption, monitor spend, and operate AI workloads with clearer financial controls.
Why this matters
Without a gateway, every application tends to manage model access, provider integration, logging, policy, and cost tracking on its own. That becomes difficult to operate as AI usage grows across teams and products.
Agumbe AI Gateway gives teams one place to manage these concerns. Application code stays simple: call the gateway, choose a model or alias, and receive a normalized response. Platform teams still get the control they need: policies, observability, routing, usage tracking, and a console for day-to-day operations.
Use Agumbe AI Gateway when you want a single, governed path for AI traffic across applications, teams, models, and providers.