

Agumbe AI Gateway gives you a stable way to work with language models without forcing your application to depend on provider-specific naming or routing behavior. When your application sends a request to the gateway, it includes a model value. That value can be either:
  • an Agumbe alias
  • a catalog-backed model ID
This design gives teams two useful modes of operation:
  • stable application-facing names for day-to-day development
  • explicit model targeting when you need direct control
For most teams, aliases are the best starting point. They keep your application code simple and make it easier to evolve routing and model strategy over time.

Why this matters

Without a gateway, model names tend to leak directly into application code. Over time, that creates friction:
  • changing models requires code changes across services
  • production traffic becomes tightly coupled to one provider’s naming scheme
  • experimenting with fallbacks or routing becomes harder
  • business teams and platform teams lose a shared vocabulary for “default” or “reasoning” behavior
Agumbe solves this by separating what your application asks for from how the platform fulfills it. Your application can ask for a stable name such as smart-default, and the gateway can resolve that to the right model and provider behind the scenes.
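As a sketch of what this separation looks like in application code (the helper name and payload shape here are illustrative, not a prescribed client API):

```python
# Hypothetical sketch: the application depends only on a stable alias,
# never on a provider-specific model name.
SMART_DEFAULT = "smart-default"  # gateway-defined alias, assumed configured

def build_chat_request(user_message: str) -> dict:
    """Build a chat completions payload that targets the alias.

    The gateway resolves the alias to a concrete model and provider,
    so this code never changes when the backing model does.
    """
    return {
        "model": SMART_DEFAULT,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_chat_request("Summarize this ticket.")
print(payload["model"])  # the application only ever sees the alias
```

Swapping the backing model is then a gateway-side change, not a code change.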

Two ways to select a model

Use an alias

An alias is a stable, gateway-defined model name. Aliases are ideal when you want:
  • a clean integration experience
  • stable application code
  • room to evolve routing later
  • a platform-controlled default for quality, speed, or reasoning
Examples:
{
  "model": "smart-default"
}
{
  "model": "cheap-fast"
}
{
  "model": "reasoning"
}
{
  "model": "embed-default"
}

Use a catalog-backed model ID

A catalog-backed model ID targets a specific model entry exposed by the gateway. Use this path when you want:
  • explicit control over the model being called
  • consistency for evaluation or benchmarking
  • fine-grained model selection in a specific workload
  • a known model target for a tightly controlled use case
Example:
{
  "model": "@anthropic/claude-sonnet-4"
}
Or:
{
  "model": "@openai/gpt-5.2"
}
Agumbe also supports catalog-style convenience names that resolve through its alias layer, such as:
{
  "model": "gpt-5.2"
}

What is an alias?

An alias is a gateway-defined name that maps to a model target. You can think of an alias as a product-facing contract between your application and the gateway. Your code depends on the alias. The gateway owns the actual resolution. This gives you a better operating model:
  • your engineering team uses stable names
  • your platform team can change routing without rewriting integrations
  • your product team can standardize how workloads talk about model classes
For example, an application might use:
  • smart-default for general-purpose chat
  • cheap-fast for low-cost, low-latency tasks
  • reasoning for more complex or deliberate outputs
  • embed-default for embeddings
The meaning stays stable even if the exact backing model changes over time.

Aliases are the safest default for most customer integrations. Use aliases when you want to:
  • reduce coupling between application code and model vendors
  • make future model changes less disruptive
  • introduce routing or fallback behavior later
  • keep your prompts and services readable
  • standardize model choices across teams
For example, this is easier to reason about in a production codebase:
{
  "model": "smart-default"
}
than embedding a provider-specific model name everywhere. Aliases are especially useful for teams with multiple services, multiple environments, or evolving model strategy.

When to use explicit model IDs

There are still good reasons to call a specific model directly. Use a catalog-backed model ID when:
  • you are comparing models side by side
  • a workflow must use a specific model for regulatory, evaluation, or internal reasons
  • you are running tests or benchmarks
  • you want exact reproducibility for a tightly scoped integration
  • you are debugging routing behavior
In other words, explicit model IDs are best when precision matters more than flexibility.

Listing available models

The gateway exposes a models endpoint so your application or team can discover what is currently available.

Endpoint: GET /api/v1/llm/models

This endpoint returns a list of model entries that can include:
  • gateway aliases
  • canonical model IDs
  • provider information
  • request kind metadata
  • alias markers
Use this endpoint when you want to:
  • populate a model selector in a UI
  • validate model names during development
  • inspect which aliases are currently exposed
  • understand whether a model supports chat or embeddings
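A minimal sketch of consuming the models endpoint response, assuming a JSON body with a `data` list whose entries carry `id`, `kind`, and `is_alias` fields (the exact field names may differ; check your gateway's actual schema):

```python
import json

# Assumed response shape for GET /api/v1/llm/models; illustrative only.
sample_response = json.loads("""
{
  "data": [
    {"id": "smart-default", "kind": "chat", "is_alias": true},
    {"id": "embed-default", "kind": "embeddings", "is_alias": true},
    {"id": "@anthropic/claude-sonnet-4", "kind": "chat", "is_alias": false}
  ]
}
""")

def chat_models(models_response: dict) -> list:
    """Return IDs of entries usable with the chat completions endpoint,
    e.g. to populate a model selector in a UI."""
    return [m["id"] for m in models_response["data"] if m["kind"] == "chat"]

print(chat_models(sample_response))
```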

Model kinds

Agumbe separates models into two request kinds:
  • chat
  • embeddings
This distinction matters because not every model can be used with every endpoint.

Chat models

Chat models are used with:

POST /api/v1/llm/chat/completions

These models support conversational or generative responses.

Embeddings models

Embeddings models are used with:

POST /api/v1/llm/embeddings

These models produce vector embeddings for search, retrieval, clustering, classification, and related use cases.

If a model resolves to the wrong kind for the endpoint you are calling, the gateway rejects the request. For example:
  • a chat-only model cannot be used with the embeddings endpoint
  • an embeddings-only model cannot be used with the chat completions endpoint
This protects applications from sending invalid traffic.
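If you want to fail fast in your own code, you can mirror the gateway's kind check client-side. This is a hypothetical guard, not part of the gateway API:

```python
# Illustrative client-side guard mirroring the gateway's kind check.
ENDPOINT_FOR_KIND = {
    "chat": "/api/v1/llm/chat/completions",
    "embeddings": "/api/v1/llm/embeddings",
}

def validate_kind(model_kind: str, endpoint: str) -> None:
    """Raise early if the model's kind does not match the endpoint,
    instead of waiting for the gateway to reject the request."""
    expected = ENDPOINT_FOR_KIND.get(model_kind)
    if expected != endpoint:
        raise ValueError(
            f"model kind {model_kind!r} cannot be used with {endpoint}"
        )

validate_kind("chat", "/api/v1/llm/chat/completions")  # passes silently
```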

How model resolution works

When your request reaches the gateway, the gateway resolves the model field before making any provider call. At a high level, resolution works like this:
  1. the gateway reads the requested model value
  2. it checks whether the value is an alias
  3. if it is an alias, the gateway resolves it to its target
  4. it validates whether the resolved model supports the requested endpoint
  5. it prepares the route plan for execution
  6. it forwards the request to the selected provider adapter
This means your application does not need to understand provider-specific request routing. The gateway handles that part for you.
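The steps above can be sketched as a small resolution function. The alias and kind tables here are stand-ins, not the gateway's internal implementation:

```python
# Illustrative sketch of the resolution steps; the mappings are assumptions.
ALIASES = {
    "smart-default": "@openai/gpt-5.2",
    "embed-default": "@openai/text-embedding-3-small",
}
MODEL_KINDS = {
    "@openai/gpt-5.2": "chat",
    "@openai/text-embedding-3-small": "embeddings",
}

def resolve(model: str, requested_kind: str) -> str:
    # Steps 1-3: read the requested value and resolve it if it is an alias.
    target = ALIASES.get(model, model)
    # Step 4: validate that the resolved model supports the endpoint.
    kind = MODEL_KINDS.get(target)
    if kind is None:
        raise ValueError(f"unknown model: {model}")
    if kind != requested_kind:
        raise ValueError(f"{model} resolves to {kind}, not {requested_kind}")
    # Steps 5-6: from here the gateway would build the route plan and
    # forward to the provider adapter.
    return target

print(resolve("smart-default", "chat"))
```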

Routing and aliases

Aliases become even more valuable once routing behavior becomes more sophisticated. Because an alias is a stable gateway name, Agumbe can attach routing logic to it over time. That may include:
  • a preferred primary model
  • retries
  • fallbacks
  • weighted candidate selection
  • future reliability rules
This is one of the strongest reasons to use aliases in production. They give the platform room to improve reliability and model strategy without changing your application contract.
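To make weighted candidate selection with fallbacks concrete, here is one way such a route plan could behave. The config shape is purely hypothetical and not Agumbe's actual routing format:

```python
import random

# Hypothetical route plan attached to an alias: weighted candidates,
# with the unchosen ones kept as ordered fallbacks.
ROUTE = {
    "candidates": [
        ("@openai/gpt-5.2", 0.8),
        ("@anthropic/claude-sonnet-4", 0.2),
    ],
}

def pick_candidates(route: dict, rng: random.Random) -> list:
    """Pick a weighted primary, then append the remaining models
    in order as fallbacks for retry logic."""
    models = [m for m, _ in route["candidates"]]
    weights = [w for _, w in route["candidates"]]
    primary = rng.choices(models, weights=weights, k=1)[0]
    return [primary] + [m for m in models if m != primary]

plan = pick_candidates(ROUTE, random.Random(0))
print(plan)  # primary first, fallbacks after
```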

Common alias patterns

The exact alias set in your gateway may evolve, but a typical setup includes names like these:

smart-default

Use for general-purpose chat tasks where you want a strong default model. This is a good fit for:
  • assistants
  • summaries
  • classification with reasoning
  • customer support workflows
  • general product features

cheap-fast

Use for lightweight tasks where low latency or lower cost matters more than maximum reasoning depth. This is a good fit for:
  • simple rewriting
  • tagging
  • short transformations
  • low-cost automation
  • bulk operational tasks

reasoning

Use for tasks that benefit from more deliberate reasoning or stronger structured thinking. This is a good fit for:
  • analysis
  • decision support
  • complex multi-step synthesis
  • workflows where answer quality matters more than speed

embed-default

Use for embeddings use cases where you want a stable vectorization default. This is a good fit for:
  • semantic search
  • retrieval pipelines
  • clustering
  • recommendation systems
  • document similarity

Example: using an alias in a chat request

{
  "model": "smart-default",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise assistant."
    },
    {
      "role": "user",
      "content": "Summarize this support ticket in one paragraph."
    }
  ],
  "max_completion_tokens": 180
}
This is the recommended pattern for most product traffic.

Example: using a specific model in a chat request

{
  "model": "@anthropic/claude-sonnet-4",
  "messages": [
    {
      "role": "user",
      "content": "Compare the tradeoffs between two architecture options."
    }
  ],
  "max_completion_tokens": 300
}
This is useful when you want direct model targeting.

Example: using an embeddings alias

{
  "model": "embed-default",
  "input": "Agumbe AI Gateway centralizes model access, guardrails, and observability."
}

Example: using a specific embeddings model

{
  "model": "@openai/text-embedding-3-small",
  "input": "Agumbe AI Gateway centralizes model access, guardrails, and observability."
}

Choosing the right model strategy

For most teams, the best model strategy is simple.

Start with aliases

Use aliases when:
  • you are integrating the gateway for the first time
  • you want clean application code
  • you want the gateway to own model evolution
  • you want room to improve routing later

Use explicit models selectively

Use specific model IDs when:
  • you need exact control
  • you are benchmarking
  • you are validating prompt behavior across models
  • you are debugging or evaluating quality differences
A good long-term pattern is to use aliases for production application traffic and explicit models for internal testing or experimentation.

Models and guardrails

Model selection and guardrails work together. Guardrail policies can include an allowed model list. When this is configured, the gateway checks the requested or resolved model against the allowed set before continuing. This means:
  • model choice is not only a product concern
  • it is also part of policy enforcement
For example, a team might allow only a small set of approved models for a given app. In that setup, the gateway rejects requests that attempt to use disallowed models, even if the caller is otherwise authenticated. This helps teams control risk, quality, and cost at the app level.
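A sketch of what an allowed-model check could look like; the policy shape below is illustrative, not Agumbe's actual guardrail schema:

```python
# Hypothetical app-level policy check against an allowed model list.
def is_model_allowed(resolved_model: str, policy: dict) -> bool:
    """Return True when no allowlist is configured, or when the
    resolved model appears on the allowlist."""
    allowed = policy.get("allowed_models")
    if allowed is None:
        return True  # no allowlist configured: any catalog model passes
    return resolved_model in allowed

policy = {"allowed_models": ["smart-default", "@openai/gpt-5.2"]}
print(is_model_allowed("smart-default", policy))   # on the allowlist
print(is_model_allowed("@openai/gpt-4o", policy))  # rejected by policy
```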

Models and production environments

As your platform grows, model strategy often becomes environment-specific. A common pattern is:
  • development uses a faster or cheaper alias
  • staging uses a more production-like alias
  • production uses a stable default alias with stronger guardrails
This works well because aliases let you keep application logic consistent while changing the backing model strategy per environment if needed.
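One common way to wire this up is a per-environment alias lookup in application config. The environment variable name and alias choices here are assumptions following the patterns above:

```python
import os

# Illustrative per-environment alias mapping; APP_ENV is an assumed
# variable name, and the aliases follow the patterns described above.
ALIAS_BY_ENV = {
    "development": "cheap-fast",
    "staging": "smart-default",
    "production": "smart-default",
}

def model_for_env(env: str = "") -> str:
    """Pick the alias for the current environment; application code
    stays identical across environments."""
    env = env or os.environ.get("APP_ENV", "development")
    return ALIAS_BY_ENV.get(env, "smart-default")

print(model_for_env("development"))  # cheap-fast
```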

Best practices

Prefer aliases for production workloads

Aliases reduce churn in your codebase and make model strategy easier to evolve.

Use the models endpoint for discovery

Do not hardcode assumptions about available models if your system needs to stay in sync with the gateway catalog.

Keep chat and embeddings clearly separated

Choose models that match the endpoint you are calling.

Combine aliases with guardrails

Model selection becomes much safer when each app has an explicit policy around allowed models, token limits, and rate limits.

Avoid scattering provider-specific names across your application

If every service chooses its own direct model ID, the platform becomes harder to standardize and govern.

Common errors

If a model cannot be resolved or does not match the endpoint, the gateway returns a structured error. Example:
{
  "error": {
    "message": "Model embed-default resolves to embeddings and cannot be used with the chat endpoint",
    "type": "invalid_request_error",
    "param": "model",
    "code": "invalid_model"
  }
}
You may also see a model-related error if:
  • the requested model is unknown
  • an alias points to an invalid target
  • the model is blocked by guardrail allowlists
  • no routing candidates are configured for the resolved model
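A small sketch of handling this error shape client-side, assuming only the fields shown in the example body above:

```python
# Sketch of inspecting the structured error body; only the fields from
# the example above (error.message, type, param, code) are assumed.
def classify_model_error(body: dict) -> str:
    """Return a short, actionable hint for model-related errors."""
    err = body.get("error", {})
    if err.get("param") != "model":
        return "not a model error"
    if err.get("code") == "invalid_model":
        return "check the model name against GET /api/v1/llm/models"
    return "model error: " + err.get("message", "unknown")

body = {
    "error": {
        "message": "Model embed-default resolves to embeddings and cannot be used with the chat endpoint",
        "type": "invalid_request_error",
        "param": "model",
        "code": "invalid_model",
    }
}
print(classify_model_error(body))
```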
If you are integrating Agumbe AI Gateway for the first time, start with:
  • smart-default for general chat
  • embed-default for embeddings
  • app-level guardrails
  • request logs enabled in your operational workflow
  • explicit model IDs only when you need direct targeting
This gives you the cleanest balance of simplicity, flexibility, and long-term maintainability.

Next steps

Once you understand model selection, the next pages to read are:
  • Guardrails to learn how app policies shape model usage
  • Routing and Reliability to understand how model resolution and execution behave in production
  • Request Logging and Observability to see how model usage appears in request records and operational flows