Agumbe AI Gateway gives you a stable way to work with language models without forcing your application to depend on provider-specific naming or routing behavior. When your application sends a request to the gateway, it includes a model value. That value can be either:
- an Agumbe alias: a stable, application-facing name for day-to-day development
- a catalog-backed model ID: an explicit model target when you need direct control
Why this matters
Without a gateway, model names tend to leak directly into application code. Over time, that creates friction:
- changing models requires code changes across services
- production traffic becomes tightly coupled to one provider’s naming scheme
- experimenting with fallbacks or routing becomes harder
- business teams and platform teams lose a shared vocabulary for “default” or “reasoning” behavior
Two ways to select a model
Use an alias
An alias is a stable, gateway-defined model name. Aliases are ideal when you want:
- a clean integration experience
- stable application code
- room to evolve routing later
- a platform-controlled default for quality, speed, or reasoning
Use a catalog-backed model ID
A catalog-backed model ID targets a specific model entry exposed by the gateway. Use this path when you want:
- explicit control over the model being called
- consistency for evaluation or benchmarking
- fine-grained model selection in a specific workload
- a known model target for a tightly controlled use case
What is an alias?
An alias is a gateway-defined name that maps to a model target. You can think of an alias as a product-facing contract between your application and the gateway. Your code depends on the alias. The gateway owns the actual resolution. This gives you a better operating model:
- your engineering team uses stable names
- your platform team can change routing without rewriting integrations
- your product team can standardize how workloads talk about model classes
Typical aliases include:
- smart-default for general-purpose chat
- cheap-fast for low-cost, low-latency tasks
- reasoning for more complex or deliberate outputs
- embed-default for embeddings
Why aliases are recommended
Aliases are the safest default for most customer integrations. Use aliases when you want to:
- reduce coupling between application code and model vendors
- make future model changes less disruptive
- introduce routing or fallback behavior later
- keep your prompts and services readable
- standardize model choices across teams
When to use explicit model IDs
There are still good reasons to call a specific model directly. Use a catalog-backed model ID when:
- you are comparing models side by side
- a workflow must use a specific model for regulatory, evaluation, or internal reasons
- you are running tests or benchmarks
- you want exact reproducibility for a tightly scoped integration
- you are debugging routing behavior
Listing available models
The gateway exposes a models endpoint so your application or team can discover what is currently available. Endpoint:
GET /api/v1/llm/models
This endpoint returns a list of model entries that can include:
- gateway aliases
- canonical model IDs
- provider information
- request kind metadata
- alias markers
You can use this endpoint to:
- populate a model selector in a UI
- validate model names during development
- inspect which aliases are currently exposed
- understand whether a model supports chat or embeddings
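As a sketch, a client can fetch the catalog and filter it in Python. The bearer-token header, the response envelope, and the exact entry field names below are assumptions for illustration; only the endpoint path comes from this page.

```python
import json
import urllib.request

def list_models(base_url: str, api_key: str) -> list[dict]:
    """Fetch the model catalog from the gateway's models endpoint."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/llm/models",
        headers={"Authorization": f"Bearer {api_key}"},  # auth scheme assumed
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]  # response envelope assumed

# Filtering a hypothetical catalog into chat aliases for a UI selector:
sample_catalog = [
    {"id": "smart-default", "kind": "chat", "alias": True},
    {"id": "embed-default", "kind": "embeddings", "alias": True},
    {"id": "example-provider/example-chat-model", "kind": "chat", "alias": False},
]
chat_aliases = [m["id"] for m in sample_catalog if m["alias"] and m["kind"] == "chat"]
```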
Model kinds
Agumbe separates models into two request kinds:
- chat
- embeddings
Chat models
Chat models are used with:
POST /api/v1/llm/chat/completions
These models support conversational or generative responses.
Embeddings models
Embeddings models are used with:
POST /api/v1/llm/embeddings
These models produce vector embeddings for search, retrieval, clustering, classification, and related use cases.
If a model resolves to the wrong kind for the endpoint you are calling, the gateway rejects the request.
For example:
- a chat-only model cannot be used with the embeddings endpoint
- an embeddings-only model cannot be used with the chat completions endpoint
How model resolution works
When your request reaches the gateway, the gateway resolves the model field before making any provider call. At a high level, resolution works like this:
- the gateway reads the requested model value
- it checks whether the value is an alias
- if it is an alias, the gateway resolves it to its target
- it validates whether the resolved model supports the requested endpoint
- it prepares the route plan for execution
- it forwards the request to the selected provider adapter
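The resolution steps above can be sketched in Python. The alias table, catalog shape, and error handling here are illustrative only, not the gateway's actual implementation.

```python
# Illustrative alias table and catalog; the real gateway owns these mappings.
ALIASES = {
    "smart-default": "example-provider/example-chat-model",
    "embed-default": "example-provider/example-embed-model",
}
CATALOG = {
    "example-provider/example-chat-model": {"kind": "chat"},
    "example-provider/example-embed-model": {"kind": "embeddings"},
}

def resolve(model: str, endpoint_kind: str) -> str:
    # Steps 1-3: read the requested value and resolve aliases to their target.
    target = ALIASES.get(model, model)
    entry = CATALOG.get(target)
    if entry is None:
        raise ValueError(f"unknown model: {model}")
    # Step 4: validate that the resolved model supports the requested endpoint.
    if entry["kind"] != endpoint_kind:
        raise ValueError(f"{target} does not support {endpoint_kind} requests")
    # Steps 5-6 (not shown): build the route plan and forward the request
    # to the selected provider adapter.
    return target
```

This also illustrates the kind check described under Model kinds: a chat alias resolved against the embeddings endpoint is rejected before any provider call.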
Routing and aliases
Aliases become even more valuable once routing behavior becomes more sophisticated. Because an alias is a stable gateway name, Agumbe can attach routing logic to it over time. That may include:
- a preferred primary model
- retries
- fallbacks
- weighted candidate selection
- future reliability rules
Common alias patterns
The exact alias set in your gateway may evolve, but a typical setup includes names like these:
smart-default
Use for general-purpose chat tasks where you want a strong default model. This is a good fit for:
- assistants
- summaries
- classification with reasoning
- customer support workflows
- general product features
cheap-fast
Use for lightweight tasks where low latency or lower cost matters more than maximum reasoning depth. This is a good fit for:
- simple rewriting
- tagging
- short transformations
- low-cost automation
- bulk operational tasks
reasoning
Use for tasks that benefit from more deliberate reasoning or stronger structured thinking. This is a good fit for:
- analysis
- decision support
- complex multi-step synthesis
- workflows where answer quality matters more than speed
embed-default
Use for embeddings use cases where you want a stable vectorization default. This is a good fit for:
- semantic search
- retrieval pipelines
- clustering
- recommendation systems
- document similarity
Example: using an alias in a chat request
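A minimal Python sketch of a chat request that uses the smart-default alias. The base URL, bearer-token header, and OpenAI-style body fields (messages, role, content) are assumptions; only the endpoint path and the alias name come from this page.

```python
import json
import urllib.request

GATEWAY_URL = "https://gateway.example.com"  # hypothetical base URL

payload = {
    "model": "smart-default",  # gateway alias; the gateway resolves the target
    "messages": [
        {"role": "user", "content": "Summarize this ticket in two sentences."}
    ],
}

def send_chat(body: dict) -> dict:
    """POST the body to the gateway's chat completions endpoint."""
    req = urllib.request.Request(
        f"{GATEWAY_URL}/api/v1/llm/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder credential
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the model field carries an alias, the platform team can later change what smart-default resolves to without touching this code.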
Example: using a specific model in a chat request
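The request shape is identical when targeting a catalog-backed model ID; only the model value changes. The ID below is a made-up placeholder, so substitute a real entry from your gateway's catalog.

```python
import json

payload = {
    # Hypothetical catalog-backed model ID; discover real entries via
    # GET /api/v1/llm/models.
    "model": "example-provider/example-chat-model",
    "messages": [
        {"role": "user", "content": "Compare these two draft answers."}
    ],
}

# POST this body to /api/v1/llm/chat/completions.
print(json.dumps(payload, indent=2))
```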
Example: using an embeddings alias
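For embeddings, the alias goes to the embeddings endpoint instead. The input field name is an OpenAI-style assumption; the alias and the endpoint path come from this page.

```python
import json

payload = {
    "model": "embed-default",            # embeddings alias
    "input": "vectorize this sentence",  # field name assumed
}

# POST this body to /api/v1/llm/embeddings.
print(json.dumps(payload))
```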
Example: using a specific embeddings model
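Targeting a specific embeddings model works the same way; the ID below is a hypothetical placeholder, and the input field name is again an assumption.

```python
import json

payload = {
    # Hypothetical embeddings model ID from the gateway catalog.
    "model": "example-provider/example-embed-model",
    "input": "vectorize this sentence",  # field name assumed
}

# POST this body to /api/v1/llm/embeddings.
print(json.dumps(payload))
```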
Choosing the right model strategy
For most teams, the best model strategy is simple.
Start with aliases
Use aliases when:
- you are integrating the gateway for the first time
- you want clean application code
- you want the gateway to own model evolution
- you want room to improve routing later
Use explicit models selectively
Use specific model IDs when:
- you need exact control
- you are benchmarking
- you are validating prompt behavior across models
- you are debugging or evaluating quality differences
Models and guardrails
Model selection and guardrails work together. Guardrail policies can include an allowed model list. When this is configured, the gateway checks the requested or resolved model against the allowed set before continuing. This means:
- model choice is not only a product concern
- it is also part of policy enforcement
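A minimal sketch of that check, assuming a policy object with an allowed-model set; the policy shape is hypothetical, not the gateway's real configuration format.

```python
# Hypothetical guardrail policy with an allowed model list.
policy = {"allowed_models": {"smart-default", "embed-default"}}

def check_allowed(resolved_model: str, policy: dict) -> bool:
    """Return True if the model passes the allowlist (or no list is set)."""
    allowed = policy.get("allowed_models")
    return allowed is None or resolved_model in allowed
```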
Models and production environments
As your platform grows, model strategy often becomes environment-specific. A common pattern is:
- development uses a faster or cheaper alias
- staging uses a more production-like alias
- production uses a stable default alias with stronger guardrails
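One way to express this pattern in application code is a per-environment alias map; the environment variable name and the alias assignments are illustrative.

```python
import os

# Illustrative environment-to-alias mapping.
MODEL_BY_ENV = {
    "development": "cheap-fast",    # faster or cheaper alias
    "staging": "smart-default",     # more production-like alias
    "production": "smart-default",  # stable default, paired with guardrails
}

def chat_model() -> str:
    """Pick the chat alias for the current environment."""
    env = os.environ.get("APP_ENV", "development")  # variable name assumed
    return MODEL_BY_ENV.get(env, "smart-default")
```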
Best practices
Prefer aliases for production workloads
Aliases reduce churn in your codebase and make model strategy easier to evolve.
Use the models endpoint for discovery
Do not hardcode assumptions about available models if your system needs to stay in sync with the gateway catalog.
Keep chat and embeddings clearly separated
Choose models that match the endpoint you are calling.
Combine aliases with guardrails
Model selection becomes much safer when each app has an explicit policy around allowed models, token limits, and rate limits.
Avoid scattering provider-specific names across your application
If every service chooses its own direct model ID, the platform becomes harder to standardize and govern.
Common errors
If a model cannot be resolved or does not match the endpoint, the gateway returns a structured error. Common causes include:
- the requested model is unknown
- an alias points to an invalid target
- the model is blocked by guardrail allowlists
- no routing candidates are configured for the resolved model
Recommended starting point
If you are integrating Agumbe AI Gateway for the first time, start with:
- smart-default for general chat
- embed-default for embeddings
- app-level guardrails
- request logs enabled in your operational workflow
- explicit model IDs only when you need direct targeting
Next steps
Once you understand model selection, the next pages to read are:
- Guardrails to learn how app policies shape model usage
- Routing and Reliability to understand how model resolution and execution behave in production
- Request Logging and Observability to see how model usage appears in request records and operational flows