Chat

Chat completions endpoint. Sends a conversation to an AI model and returns the response as a Server-Sent Event (SSE) stream or a single JSON object.

POST /chat

Authentication

Requires one of:

  • X-Api-Key header with a valid API key

  • Authorization and X-Firebase-AppCheck headers for browser-based auth


Request Body

Parameter
Type
Required
Default
Description

model

string

No

Auto-routed

Model ID to use. If omitted, the platform auto-selects based on query complexity.

messages

array

Yes

--

Array of message objects forming the conversation.

stream

boolean

No

true

Whether to stream the response via SSE.

temperature

float

No

0.7

Sampling temperature. Range: 0.0 to 2.0.

max_tokens

integer

No

Model default

Maximum tokens to generate. Clamped to the model's maximum output limit.

session_id

string

No

--

Resume an existing conversation session.

Message Object

Field
Type
Required
Description

role

string

Yes

One of system, user, or assistant.

content

string

Yes

The message content.


Request Example


Response (Streaming)

When stream is true (the default), the response is an SSE stream with Content-Type: text/event-stream.

Event Types

chunk -- Incremental text content:

done -- Stream complete with usage metadata:

error -- Error during generation:


Response (Non-Streaming)

When stream is false, the response is a single JSON object:


Auto-Routing

When the model parameter is omitted, the platform automatically selects a model based on the content and complexity of the query. This is useful for general-purpose applications that do not need to target a specific model.


Credit Billing

Credits are handled using a reservation-and-reconciliation model:

  1. Reservation: Before generation begins, the platform reserves credits based on a worst-case estimate (maximum possible output tokens for the selected model).

  2. Generation: The model produces the response.

  3. Reconciliation: After generation completes, the reserved amount is adjusted to reflect the actual token usage. Unused reserved credits are returned to the user's balance.

If the user's credit balance is insufficient to cover the worst-case reservation, the request is rejected with a 403 status.


Error Responses

Status
Description

400

Invalid request body or parameters

401

Missing or invalid authentication

403

Insufficient credits for the request

404

Specified model not found

429

Rate limit exceeded

500

Provider or internal server error

Last updated