

This page shows how to integrate Agumbe AI Gateway from a TypeScript application. Agumbe AI Gateway exposes a provider-compatible API surface, which means you can use a familiar TypeScript SDK pattern and point it to the Agumbe base URL. This makes it easy to get started quickly while still keeping model routing, guardrails, request logging, and observability inside the gateway. For production use, the recommended pattern is to call the gateway from your backend service, not directly from the browser.

Before you begin

Make sure you have:
  • an Agumbe Gateway API key
  • the Agumbe base URL
  • a backend or server-side TypeScript environment
Base URL: https://api.agumbe.ai/api/v1/llm

Set your API key as an environment variable:

export AGUMBE_API_KEY="your_agumbe_gateway_api_key"

Install the SDK

npm install openai

Even though the package name is openai, you are pointing it at Agumbe AI Gateway, not directly at any one model provider.

Create a client

import OpenAI from "openai";

export const agumbe = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY,
  baseURL: "https://api.agumbe.ai/api/v1/llm",
});
This client becomes your single entry point for chat and embeddings requests.

Send your first chat request

import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "system",
        content: "You are a concise assistant.",
      },
      {
        role: "user",
        content: "Explain what an AI gateway does in one paragraph.",
      },
    ],
    max_completion_tokens: 220,
  });

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
This is the recommended starting pattern for most teams:
  • use a stable alias such as smart-default
  • keep the integration server-side
  • start simple
  • evolve routing and guardrails later through the gateway
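If your gateway passes streaming through (an assumption here; confirm for your deployment), the same SDK pattern works with `stream: true`. A small helper that prints and accumulates streamed deltas might look like this; `ChatChunk` is a simplified local type, not the SDK's:

```typescript
// Simplified chunk shape; the openai SDK's ChatCompletionChunk is compatible.
type ChatChunk = { choices: Array<{ delta?: { content?: string } }> };

// Accumulate streamed deltas into one string, printing tokens as they arrive.
export async function collectStream(
  stream: AsyncIterable<ChatChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content ?? "";
    process.stdout.write(delta);
    text += delta;
  }
  return text;
}

// Usage (assumes streaming is enabled on your gateway):
// const stream = await agumbe.chat.completions.create({
//   model: "smart-default",
//   messages: [{ role: "user", content: "Explain AI gateways briefly." }],
//   stream: true,
// });
// const full = await collectStream(stream);
```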

Send an embeddings request

import { agumbe } from "./client";

async function main() {
  const response = await agumbe.embeddings.create({
    model: "embed-default",
    input: "Agumbe AI Gateway helps teams route, govern, and observe AI traffic.",
  });

  console.log(response.data[0]?.embedding);
}

main().catch(console.error);
Use embeddings when you are building search, retrieval, classification, clustering, or similarity workflows.
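Embedding vectors are usually compared with cosine similarity. A small dependency-free helper (not part of the Agumbe API) for comparing two embeddings:

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns a value in [-1, 1]; higher means more similar.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Given two embeddings responses, `cosineSimilarity(resA.data[0].embedding, resB.data[0].embedding)` gives a similarity score you can rank on.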

Use a specific app policy

If you are using a tenant-scoped API key, you can choose which app’s guardrails apply by sending agumbe_guardrails_app_id.
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Draft a safe reply to this customer message.",
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
  } as any); // cast needed: agumbe_* extension fields are not in the SDK's request types

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
If you are using an app-scoped API key, the gateway applies the bound app policy automatically, so you usually do not need to pass this field.
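The `as any` cast is needed because `agumbe_guardrails_app_id` is a gateway extension the openai SDK's request types do not know about. One way to keep call sites clean is a small merge helper; this is a sketch with minimal local types (a real project would use the SDK's `ChatCompletionCreateParams` type), and the extension field names are the ones documented on this page:

```typescript
// Minimal local shape for chat params; in a real project, use the
// openai SDK's ChatCompletionCreateParams type instead.
interface ChatParams {
  model: string;
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
  max_completion_tokens?: number;
}

// Gateway extension fields described on this page.
export interface AgumbeExtensions {
  agumbe_guardrails_app_id?: string;
  agumbe_metadata?: Record<string, string>;
  agumbe_grounding_context?: string[];
}

// Merge standard params with gateway extensions in one place,
// so the untyped cast lives here rather than at every call site.
export function withAgumbe(params: ChatParams, ext: AgumbeExtensions) {
  return { ...params, ...ext };
}

// Usage:
// const response = await agumbe.chat.completions.create(
//   withAgumbe(
//     { model: "smart-default", messages },
//     { agumbe_guardrails_app_id: "app_support" }
//   ) as any
// );
```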

Attach request metadata

Agumbe supports request metadata that makes logs and observability much more useful. You can attach metadata such as:
  • workspace_id
  • namespace_id
  • source_service
  • operation
  • external_request_id
Example:
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Summarize this support ticket.",
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
    agumbe_metadata: {
      workspace_id: "workspace_123",
      source_service: "support-api",
      operation: "ticket_summary",
      external_request_id: "ticket_789",
    },
  } as any);

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
This is especially useful in production, where teams need to connect gateway traffic back to internal systems and workflows.

Use grounding context

If groundedness checks are part of your app policy, you can send grounding context with the request.
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Answer this question using the supplied refund policy.",
      },
    ],
    agumbe_guardrails_app_id: "app_support",
    agumbe_grounding_context: [
      "Refunds are available within 14 days of purchase.",
      "Support agents must not promise exceptions outside the published refund policy.",
    ],
  } as any);

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);

Read the response

A successful chat response includes the generated content and token usage.
const response = await agumbe.chat.completions.create({
  model: "smart-default",
  messages: [
    {
      role: "user",
      content: "Write a one-line summary of this ticket.",
    },
  ],
});

const text = response.choices[0]?.message?.content ?? "";
const usage = response.usage;

console.log(text);
console.log(usage);
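If you need raw response headers, the openai Node SDK supports `.withResponse()` on requests, which returns the parsed body alongside the underlying fetch `Response`. The header names below are illustrative assumptions, not documented Agumbe headers; replace them with whatever your gateway actually emits:

```typescript
// Pull assumed gateway headers out of a raw fetch Response's headers.
// These header names are placeholders, not a documented Agumbe contract.
export function readGatewayHeaders(headers: Headers) {
  return {
    requestId: headers.get("x-request-id"),
    latencyMs: headers.get("x-gateway-latency-ms"),
    estimatedCost: headers.get("x-estimated-cost"),
  };
}

// Usage with the openai SDK:
// const { data, response } = await agumbe.chat.completions
//   .create({ model: "smart-default", messages })
//   .withResponse();
// console.log(readGatewayHeaders(response.headers));
```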
You can also inspect gateway-specific response metadata in your HTTP layer, such as timing headers and estimated cost headers, if your client stack exposes raw response headers.

Organize your integration

A simple server-side structure works well for most teams:

src/
  lib/
    agumbe.ts
  services/
    summarizer.ts
  routes/
    support.ts
Example client module:
import OpenAI from "openai";

export const agumbe = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY,
  baseURL: "https://api.agumbe.ai/api/v1/llm",
});
Example service module:
import { agumbe } from "../lib/agumbe";

export async function summarizeTicket(ticketText: string) {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "system",
        content: "You summarize support tickets for an operations team.",
      },
      {
        role: "user",
        content: ticketText,
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
  } as any);

  return response.choices[0]?.message?.content ?? "";
}
This pattern keeps your gateway setup centralized and makes the rest of the codebase easier to maintain.

Error handling

Agumbe returns structured errors. In TypeScript, you should catch errors and handle them deliberately.
import { agumbe } from "./client";

async function main() {
  try {
    const response = await agumbe.chat.completions.create({
      model: "smart-default",
      messages: [
        {
          role: "user",
          content: "Explain AI gateways briefly.",
        },
      ],
    });

    console.log(response.choices[0]?.message?.content ?? "");
  } catch (error: any) {
    console.error("Gateway request failed");

    if (error?.status) {
      console.error("Status:", error.status);
    }

    if (error?.error) {
      console.error("Gateway error:", error.error);
    } else {
      console.error(error);
    }
  }
}

main();
Typical failure cases include:
  • invalid credentials
  • invalid model selection
  • app mismatch
  • guardrail policy blocks
  • rate limits
  • upstream timeout or provider failure
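Rate limits and transient upstream failures are usually worth retrying with exponential backoff. A minimal sketch; the retryable status codes below are a common default, not an Agumbe-specified list:

```typescript
// Status codes treated as transient and worth retrying.
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

// Retry an async request with exponential backoff on transient failures.
export async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      // Non-transient errors (bad request, auth, policy block) fail fast.
      if (!RETRYABLE.has(error?.status)) throw error;
      if (attempt < maxAttempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage:
// const response = await withRetries(() =>
//   agumbe.chat.completions.create({ model: "smart-default", messages })
// );
```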

Production recommendations

When integrating from TypeScript, follow these guidelines:
  • keep the API key in server-side environment variables
  • do not expose the key in frontend bundles
  • prefer aliases such as smart-default and embed-default
  • use app-scoped keys when the workload is fixed
  • attach request metadata for important workflows
  • send traffic through your backend or service layer
  • start with a small number of stable integration patterns
A good first production setup is usually:
  • one backend service
  • one app-scoped key
  • one chat alias
  • one embeddings alias
  • one app policy
  • request metadata on every important workflow