API reference

Infer exposes an OpenAI-compatible /chat/completions endpoint. Any SDK or HTTP client that accepts a custom base URL can call it; the only changes needed are the base URL and the API key.

Tip:

New to Infer? Run through the Quickstart first to get an API key and the base URL and to run a connection test. This page assumes you already have the key and the base URL.

Endpoint

POST https://api-agenthub-pre.riema.xyz/v1/chat/completions

Authentication

Pass your API key as a Bearer token in the Authorization header:

Authorization: Bearer your_api_key

Keys are scoped to a team. Create and rotate them in the API Keys dashboard.
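
The header can be built without hard-coding the key. The following is a minimal sketch that assumes the key is stored in an environment variable; the variable name INFER_API_KEY and the helper auth_headers are illustrative choices, not names defined by Infer.

```python
import os


def auth_headers(env_var: str = "INFER_API_KEY") -> dict[str, str]:
    """Build request headers with the key read from an environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your API key before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which makes rotating it in the dashboard a configuration change rather than a code change.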

Request body

Field | Required | Description
model | yes | Model ID to route to, e.g. gpt-5.4. See the Models catalog.
messages | yes | Array of chat messages. Must contain at least one entry.
temperature | no | Sampling temperature. The default depends on the model.
max_tokens | no | Upper bound on output tokens. On reasoning models, hidden reasoning tokens can consume the entire budget; the response then has empty content and finish_reason: "length".
stream | no | When true, the response is a Server-Sent Events stream of deltas.
tools / tool_choice | no | Function calling, same schema as OpenAI.

Any other field in the OpenAI /chat/completions schema (top_p, stop, seed, response_format, …) is accepted unchanged.
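
As a rough illustration of the field rules above, the sketch below assembles a request body and checks the two required fields before sending. The helper build_chat_request is hypothetical, and dropping None-valued options so that model defaults apply is an assumption of this example rather than documented behavior.

```python
import json
from typing import Any


def build_chat_request(
    model: str,
    messages: list[dict[str, Any]],
    **options: Any,
) -> str:
    """Serialize a /chat/completions request body.

    `options` carries optional fields such as temperature, max_tokens,
    stream, or top_p; they are passed through unchanged.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one entry")
    body: dict[str, Any] = {"model": model, "messages": messages}
    # Omit options left as None instead of sending an explicit null.
    body.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(body)


payload = build_chat_request(
    "your_model",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=256,
)
```

The resulting string can be sent as the body of the POST request with any HTTP client.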

Response

A non-streaming response matches the OpenAI shape:

{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1738960610,
"model": "gpt-5.4",
"choices": [
  {
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
    "finish_reason": "stop"
  }
],
"usage": {
  "prompt_tokens": 13,
  "completion_tokens": 9,
  "total_tokens": 22
}
}
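
To show how the fields fit together, here is a small sketch that reads the reply text and token usage from a response of this shape. The helper extract_reply is hypothetical; it also flags the case described under max_tokens, where the content is empty and finish_reason is "length".

```python
import json

# Sample body copied from the response shape above.
raw = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1738960610,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 9, "total_tokens": 22}
}
"""


def extract_reply(response: dict) -> str:
    """Return the assistant text from the first choice."""
    choice = response["choices"][0]
    content = choice["message"].get("content") or ""
    if not content and choice.get("finish_reason") == "length":
        # The output budget ran out before any visible text was produced.
        raise RuntimeError("Empty content with finish_reason 'length'; consider raising max_tokens")
    return content


response = json.loads(raw)
print(extract_reply(response))
print(response["usage"]["total_tokens"])
```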

Examples

curl
curl https://api-agenthub-pre.riema.xyz/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your_model",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
python
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://api-agenthub-pre.riema.xyz/v1",
)

response = client.chat.completions.create(
    model="your_model",
    messages=[{"role": "user", "content": "Hello" }],
)

print(response.choices[0].message.content)
javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your_api_key",
  baseURL: "https://api-agenthub-pre.riema.xyz/v1",
});

const response = await client.chat.completions.create({
  model: "your_model",
  messages: [{ role: "user", content: "Hello" }],
});

console.log(response.choices[0].message.content);

Streaming

Set stream: true in the request body. The response becomes a Server-Sent Events stream. Each chunk follows the OpenAI chat.completion.chunk shape, and the stream terminates with a data: [DONE] line:

curl
curl https://api-agenthub-pre.riema.xyz/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your_model",
    "stream": true,
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
python
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://api-agenthub-pre.riema.xyz/v1",
)

stream = client.chat.completions.create(
    model="your_model",
    messages=[{"role": "user", "content": "Hello" }],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your_api_key",
  baseURL: "https://api-agenthub-pre.riema.xyz/v1",
});

const stream = await client.chat.completions.create({
  model: "your_model",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}
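
For clients that read the stream without an SDK, the sketch below shows one way to parse the raw Server-Sent Events lines and stop at the data: [DONE] sentinel. The function iter_content and the sample lines are illustrative; real chunks carry more fields (id, created, model) than the abbreviated ones used here.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from the lines of an SSE response body."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta


# Hand-written, abbreviated chunks for illustration only.
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "lo"}, "finish_reason": null}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_content(sample)))
```

The first chunk in a stream often carries only the role, and some chunks may have no content at all, which is why empty deltas are skipped rather than treated as errors.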

Error codes

Code | Meaning | Fix
400 | Invalid request, malformed JSON body, or unsupported parameter. | Check the request body against the field table above and confirm the model value matches an ID from the live model list.
401 | Invalid, missing, or revoked API key. | Re-copy the key from the API Keys dashboard and confirm the Authorization header is present.
402 | The selected team has insufficient balance for the request. | Open Billing, add funds to the team, then retry the same request.
429 | Rate or quota limit reached. | Back off with exponential delay, reduce concurrency, or check team quota before retrying.
500 | Gateway or upstream model provider error. | Retry after a short delay. If it persists, try another enabled model or contact support.
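
The retry guidance for 429 and 500 can be sketched as a small wrapper with exponential backoff and jitter. This is one possible policy, not a documented Infer requirement: the names call_with_retry and backoff_delay, the 0.5 second base, the 30 second cap, and the limit of five attempts are all assumptions of this example. The send callable stands in for whatever HTTP call your client makes and returns a (status, body) pair.

```python
import random
import time
from typing import Callable

# Statuses the table suggests retrying; 400, 401, and 402 need a fix first.
RETRYABLE_STATUSES = {429, 500}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retry `attempt` (0-based): exponential with full jitter."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def call_with_retry(
    send: Callable[[], tuple[int, str]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the wrapper easy to test, and the jitter spreads retries from concurrent clients so they do not all hit the limit again at the same moment.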
