

This page shows how to integrate Agumbe AI Gateway from a TypeScript application. Agumbe AI Gateway exposes a provider-compatible API surface, which means you can use a familiar TypeScript SDK pattern and point it to the Agumbe base URL. This makes it easy to get started quickly while still keeping model routing, guardrails, request logging, and observability inside the gateway. For production use, the recommended pattern is to call the gateway from your backend service, not directly from the browser.

Before you begin

Make sure you have:
  • an Agumbe Gateway API key
  • the Agumbe base URL
  • a backend or server-side TypeScript environment
Base URL: https://api.agumbe.ai/api/v1/llm

Set your API key as an environment variable:

export AGUMBE_API_KEY="your_agumbe_gateway_api_key"

Install the SDK

npm install openai

Even though the package name is openai, you are pointing it at Agumbe AI Gateway, not directly at any one model provider.

Create a client

import OpenAI from "openai";

export const agumbe = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY,
  baseURL: "https://api.agumbe.ai/api/v1/llm",
});
This client becomes your single entry point for chat and embeddings requests.

Send your first chat request

import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "system",
        content: "You are a concise assistant.",
      },
      {
        role: "user",
        content: "Explain what an AI gateway does in one paragraph.",
      },
    ],
    max_completion_tokens: 220,
  });

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
This is the recommended starting pattern for most teams:
  • use a stable alias such as smart-default
  • keep the integration server-side
  • start simple
  • evolve routing and guardrails later through the gateway
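If your gateway passes streaming through (an assumption here; confirm for your deployment), the same SDK pattern works with `stream: true`. A small helper that prints and accumulates streamed deltas might look like this; `ChatChunk` is a simplified local type, not the SDK's:

```typescript
// Simplified chunk shape; the openai SDK's ChatCompletionChunk is compatible.
type ChatChunk = { choices: Array<{ delta?: { content?: string } }> };

// Accumulate streamed deltas into one string, printing tokens as they arrive.
export async function collectStream(
  stream: AsyncIterable<ChatChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content ?? "";
    process.stdout.write(delta);
    text += delta;
  }
  return text;
}

// Usage (assumes streaming is enabled on your gateway):
// const stream = await agumbe.chat.completions.create({
//   model: "smart-default",
//   messages: [{ role: "user", content: "Explain AI gateways briefly." }],
//   stream: true,
// });
// const full = await collectStream(stream);
```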

Send an embeddings request

import { agumbe } from "./client";

async function main() {
  const response = await agumbe.embeddings.create({
    model: "embed-default",
    input: "Agumbe AI Gateway helps teams route, govern, and observe AI traffic.",
  });

  console.log(response.data[0]?.embedding);
}

main().catch(console.error);
Use embeddings when you are building search, retrieval, classification, clustering, or similarity workflows.
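Embedding vectors are usually compared with cosine similarity. A small dependency-free helper (not part of the Agumbe API) for comparing two embeddings:

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns a value in [-1, 1]; higher means more similar.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Given two embeddings responses, `cosineSimilarity(resA.data[0].embedding, resB.data[0].embedding)` gives a similarity score you can rank on.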

Use a specific app policy

If you are using a tenant-scoped API key, you can choose which app’s guardrails apply by sending agumbe_guardrails_app_id.
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Draft a safe reply to this customer message.",
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
  } as any); // cast needed: agumbe_* extension fields are not in the SDK's request types

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
If you are using an app-scoped API key, the gateway applies the bound app policy automatically, so you usually do not need to pass this field.
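The `as any` cast is needed because `agumbe_guardrails_app_id` is a gateway extension the openai SDK's request types do not know about. One way to keep call sites clean is a small merge helper; this is a sketch with minimal local types (a real project would use the SDK's `ChatCompletionCreateParams` type), and the extension field names are the ones documented on this page:

```typescript
// Minimal local shape for chat params; in a real project, use the
// openai SDK's ChatCompletionCreateParams type instead.
interface ChatParams {
  model: string;
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
  max_completion_tokens?: number;
}

// Gateway extension fields described on this page.
export interface AgumbeExtensions {
  agumbe_guardrails_app_id?: string;
  agumbe_metadata?: Record<string, string>;
  agumbe_grounding_context?: string[];
}

// Merge standard params with gateway extensions in one place,
// so the untyped cast lives here rather than at every call site.
export function withAgumbe(params: ChatParams, ext: AgumbeExtensions) {
  return { ...params, ...ext };
}

// Usage:
// const response = await agumbe.chat.completions.create(
//   withAgumbe(
//     { model: "smart-default", messages },
//     { agumbe_guardrails_app_id: "app_support" }
//   ) as any
// );
```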

Attach request metadata

Agumbe supports request metadata that makes logs and observability much more useful. You can attach metadata such as:
  • workspace_id
  • namespace_id
  • source_service
  • operation
  • external_request_id
Example:
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Summarize this support ticket.",
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
    agumbe_metadata: {
      workspace_id: "workspace_123",
      source_service: "support-api",
      operation: "ticket_summary",
      external_request_id: "ticket_789",
    },
  } as any);

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);
This is especially useful in production, where teams need to connect gateway traffic back to internal systems and workflows.

Use grounding context

If groundedness checks are part of your app policy, you can send grounding context with the request.
import { agumbe } from "./client";

async function main() {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "user",
        content: "Answer this question using the supplied refund policy.",
      },
    ],
    agumbe_guardrails_app_id: "app_support",
    agumbe_grounding_context: [
      "Refunds are available within 14 days of purchase.",
      "Support agents must not promise exceptions outside the published refund policy.",
    ],
  } as any);

  console.log(response.choices[0]?.message?.content ?? "");
}

main().catch(console.error);

Read the response

A successful chat response includes the generated content and token usage.
const response = await agumbe.chat.completions.create({
  model: "smart-default",
  messages: [
    {
      role: "user",
      content: "Write a one-line summary of this ticket.",
    },
  ],
});

const text = response.choices[0]?.message?.content ?? "";
const usage = response.usage;

console.log(text);
console.log(usage);
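If you need raw response headers, the openai Node SDK supports `.withResponse()` on requests, which returns the parsed body alongside the underlying fetch `Response`. The header names below are illustrative assumptions, not documented Agumbe headers; replace them with whatever your gateway actually emits:

```typescript
// Pull assumed gateway headers out of a raw fetch Response's headers.
// These header names are placeholders, not a documented Agumbe contract.
export function readGatewayHeaders(headers: Headers) {
  return {
    requestId: headers.get("x-request-id"),
    latencyMs: headers.get("x-gateway-latency-ms"),
    estimatedCost: headers.get("x-estimated-cost"),
  };
}

// Usage with the openai SDK:
// const { data, response } = await agumbe.chat.completions
//   .create({ model: "smart-default", messages })
//   .withResponse();
// console.log(readGatewayHeaders(response.headers));
```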
You can also inspect gateway-specific response metadata in your HTTP layer, such as timing headers and estimated cost headers, if your client stack exposes raw response headers.

Organize your integration

A simple server-side structure works well for most teams:

src/
  lib/
    agumbe.ts
  services/
    summarizer.ts
  routes/
    support.ts
Example client module:
import OpenAI from "openai";

export const agumbe = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY,
  baseURL: "https://api.agumbe.ai/api/v1/llm",
});
Example service module:
import { agumbe } from "../lib/agumbe";

export async function summarizeTicket(ticketText: string) {
  const response = await agumbe.chat.completions.create({
    model: "smart-default",
    messages: [
      {
        role: "system",
        content: "You summarize support tickets for an operations team.",
      },
      {
        role: "user",
        content: ticketText,
      },
    ],
    max_completion_tokens: 180,
    agumbe_guardrails_app_id: "app_support",
  } as any);

  return response.choices[0]?.message?.content ?? "";
}
This pattern keeps your gateway setup centralized and makes the rest of the codebase easier to maintain.

Error handling

Agumbe returns structured errors. In TypeScript, you should catch errors and handle them deliberately.
import { agumbe } from "./client";

async function main() {
  try {
    const response = await agumbe.chat.completions.create({
      model: "smart-default",
      messages: [
        {
          role: "user",
          content: "Explain AI gateways briefly.",
        },
      ],
    });

    console.log(response.choices[0]?.message?.content ?? "");
  } catch (error: any) {
    console.error("Gateway request failed");

    if (error?.status) {
      console.error("Status:", error.status);
    }

    if (error?.error) {
      console.error("Gateway error:", error.error);
    } else {
      console.error(error);
    }
  }
}

main();
Typical failure cases include:
  • invalid credentials
  • invalid model selection
  • app mismatch
  • guardrail policy blocks
  • rate limits
  • upstream timeout or provider failure
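Rate limits and transient upstream failures are usually worth retrying with exponential backoff. A minimal sketch; the retryable status codes below are a common default, not an Agumbe-specified list:

```typescript
// Status codes treated as transient and worth retrying.
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

// Retry an async request with exponential backoff on transient failures.
export async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      // Non-transient errors (bad request, auth, policy block) fail fast.
      if (!RETRYABLE.has(error?.status)) throw error;
      if (attempt < maxAttempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage:
// const response = await withRetries(() =>
//   agumbe.chat.completions.create({ model: "smart-default", messages })
// );
```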

Production recommendations

When integrating from TypeScript, follow these guidelines:
  • keep the API key in server-side environment variables
  • do not expose the key in frontend bundles
  • prefer aliases such as smart-default and embed-default
  • use app-scoped keys when the workload is fixed
  • attach request metadata for important workflows
  • send traffic through your backend or service layer
  • start with a small number of stable integration patterns
A good first production setup is usually:
  • one backend service
  • one app-scoped key
  • one chat alias
  • one embeddings alias
  • one app policy
  • request metadata on every important workflow