Code (Agentic)

Agentic tool-use endpoint for coding assistants. Functions identically to /chat but additionally supports tool_use events in the SSE stream, enabling the client to execute tools on behalf of the model.

POST /code

This endpoint is the primary interface for the VS Code extension and other agentic coding clients. The model may request tool calls (file reads, searches, terminal commands, etc.), and the client is responsible for executing those tools and returning the results in subsequent requests.

Note: This differs from the internal /internal/agent/code endpoint, where the server executes tools autonomously. On /code, tool execution is always delegated to the client.

Authentication

Requires one of:

- X-Api-Key header with a valid API key
- Authorization and X-Firebase-AppCheck headers for browser-based auth
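As a minimal sketch of the two modes above, the helper below builds the headers for either one. The function name `build_auth_headers` and its parameter names are hypothetical, and the `Bearer` scheme used for `Authorization` is an assumption; the documentation only states which headers are required.

```python
def build_auth_headers(api_key=None, id_token=None, app_check_token=None):
    """Build request headers for one of the two documented auth modes."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Mode 1: a single X-Api-Key header.
        headers["X-Api-Key"] = api_key
    elif id_token and app_check_token:
        # Mode 2: browser-based auth needs both headers.
        # Assumption: Authorization carries a Bearer token.
        headers["Authorization"] = f"Bearer {id_token}"
        headers["X-Firebase-AppCheck"] = app_check_token
    else:
        raise ValueError("provide api_key, or both id_token and app_check_token")
    return headers
```

Raising when neither mode is fully specified surfaces the problem locally rather than as a 401 from the server.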

Request Body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | No | Auto-routed | Model ID to use. Must be a model that supports tool use. |
| messages | array | Yes | - | Array of message objects, including tool results from prior turns. |
| stream | boolean | No | true | Whether to stream the response via SSE. |
| temperature | float | No | 0.7 | Sampling temperature. Range: 0.0 to 2.0. |
| max_tokens | integer | No | Model default | Maximum tokens to generate. Clamped to the model's maximum output limit. |
| session_id | string | No | - | Resume an existing conversation session. |
| tools | array | No | - | Tool definitions available for the model to call. |
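The parameters above can be assembled client-side before sending. The sketch below is one way to do it, assuming optional fields are simply omitted when unset so the server applies its own defaults; `build_code_request` is a hypothetical helper, not part of the API.

```python
def build_code_request(messages, tools=None, model=None, stream=True,
                       temperature=0.7, max_tokens=None, session_id=None):
    """Assemble a /code request body from the documented parameters."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    body = {"messages": messages, "stream": stream, "temperature": temperature}
    optional = {
        "model": model,            # omitted: the server auto-routes
        "max_tokens": max_tokens,  # omitted: the model default applies
        "session_id": session_id,
        "tools": tools,
    }
    # Only include optional fields that were actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

Checking the temperature range locally avoids a round-trip that would presumably end in a 400 response.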

Message Object

Messages follow the Anthropic tool-use conversation format:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | One of system, user, assistant, or tool. |
| content | string or array | Yes | Text content or array of content blocks (for tool results). |
| tool_use_id | string | Conditional | Required when role is tool. The ID of the tool call being responded to. |
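The field rules above are easy to check before a request is sent. The following is a rough sketch of such a check; `validate_message` is a hypothetical client-side helper, and the server's own validation may be stricter.

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}

def validate_message(message):
    """Return a list of problems found in one message object (empty if valid)."""
    problems = []
    if message.get("role") not in VALID_ROLES:
        problems.append("role must be one of system, user, assistant, tool")
    if not isinstance(message.get("content"), (str, list)):
        problems.append("content must be a string or an array of content blocks")
    if message.get("role") == "tool" and not message.get("tool_use_id"):
        problems.append("tool_use_id is required when role is tool")
    return problems
```

Returning a list rather than raising lets a client report every problem in a message at once.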

Tool Definition Object

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique tool name (e.g., read_file, run_command). |
| description | string | Yes | Description of what the tool does. |
| input_schema | object | Yes | JSON Schema defining the tool's input parameters. |
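For illustration, here is a tool definition written as a Python dictionary, with two small checks a client might run. Both helper names are hypothetical, and the input check only looks at the schema's `required` list; it is not a full JSON Schema validator.

```python
READ_FILE_TOOL = {
    "name": "read_file",
    "description": "Read the contents of a file at the given path.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute file path"}
        },
        "required": ["path"],
    },
}

def missing_definition_fields(tool):
    """List the required tool-definition fields that are absent."""
    return [f for f in ("name", "description", "input_schema") if f not in tool]

def missing_required_inputs(tool, tool_input):
    """List schema-required input keys missing from a tool call's input."""
    required = tool["input_schema"].get("required", [])
    return [key for key in required if key not in tool_input]
```

A client could run `missing_required_inputs` on the `input` of an incoming tool call before executing it.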

Request Example

Initial request with tool definitions:

curl -X POST https://inference-api-611798501438.us-central1.run.app/code \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: inf_your_api_key_here" \
  -d '{
    "model": "claude-sonnet",
    "messages": [
      {"role": "user", "content": "Read the contents of main.py and summarize it."}
    ],
    "tools": [
      {
        "name": "read_file",
        "description": "Read the contents of a file at the given path.",
        "input_schema": {
          "type": "object",
          "properties": {
            "path": {"type": "string", "description": "Absolute file path"}
          },
          "required": ["path"]
        }
      }
    ],
    "stream": true
  }'

Follow-up request with tool result:

curl -X POST https://inference-api-611798501438.us-central1.run.app/code \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: inf_your_api_key_here" \
  -d '{
    "model": "claude-sonnet",
    "session_id": "abc123",
    "messages": [
      {"role": "user", "content": "Read the contents of main.py and summarize it."},
      {"role": "assistant", "content": [
        {"type": "text", "text": "I will read the file for you."},
        {"type": "tool_use", "id": "tool_01", "name": "read_file", "input": {"path": "/app/main.py"}}
      ]},
      {"role": "tool", "tool_use_id": "tool_01", "content": "from fastapi import FastAPI\n\napp = FastAPI()\n\n@app.get(\"/\")\ndef root():\n    return {\"status\": \"ok\"}"}
    ],
    "tools": [
      {
        "name": "read_file",
        "description": "Read the contents of a file at the given path.",
        "input_schema": {
          "type": "object",
          "properties": {
            "path": {"type": "string", "description": "Absolute file path"}
          },
          "required": ["path"]
        }
      }
    ],
    "stream": true
  }'

Response (Streaming)

The SSE stream may include the following event types:

`chunk` -- Incremental text content

data: {"type": "chunk", "content": "I'll read the file for you."}

`tool_use` -- Model requests a tool call

data: {"type": "tool_use", "id": "tool_01", "name": "read_file", "input": {"path": "/app/main.py"}}

When the client receives a tool_use event, it should:

1. Execute the requested tool locally.
2. Send a new request to /code with the tool result appended to the messages array.
3. Continue until the model produces a final text response without further tool calls.
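The first two steps can be sketched as a small dispatcher that runs the requested tool and wraps its output in a tool-result message. `handle_tool_use` and the `registry` mapping are hypothetical names; reporting failures back to the model as text, instead of aborting, is a design assumption rather than documented behavior.

```python
def handle_tool_use(event, registry):
    """Run the tool named in a tool_use event and return the tool-result message."""
    func = registry.get(event["name"])
    if func is None:
        content = f"Error: unknown tool {event['name']}"
    else:
        try:
            # The event's input object is passed as keyword arguments.
            content = str(func(**event["input"]))
        except Exception as exc:
            # Assumption: surface the failure to the model as text.
            content = f"Error: {exc}"
    return {"role": "tool", "tool_use_id": event["id"], "content": content}
```

The returned dictionary is what gets appended to the messages array in the follow-up request.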

`done` -- Stream complete

data: {"type": "done", "model": "claude-sonnet", "session_id": "abc123", "credits_used": 34, "input_tokens": 128, "output_tokens": 210}

`error` -- Error during generation

data: {"type": "error", "message": "Model returned an error"}
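A client has to turn the raw `data:` lines into events and sort them by type. The sketch below shows one way to do that over an iterable of text lines; both function names are hypothetical, and it assumes each event arrives as a single `data:` line carrying a JSON object, as in the examples above.

```python
import json

def parse_sse_events(lines):
    """Yield the JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)

def summarize_turn(events):
    """Collect the text, tool calls, and done event from one response."""
    text, tool_calls, done = [], [], None
    for event in events:
        kind = event.get("type")
        if kind == "chunk":
            text.append(event["content"])
        elif kind == "tool_use":
            tool_calls.append(event)
        elif kind == "done":
            done = event
        elif kind == "error":
            raise RuntimeError(event["message"])
    return "".join(text), tool_calls, done
```

Because `parse_sse_events` accepts any iterable of strings, it can be fed from a streaming HTTP response or from a list of lines in a test.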

Agentic Conversation Loop

A typical agentic session follows this pattern:

Client                              Server
  |                                    |
  |  POST /code (user message)         |
  |  --------------------------------> |
  |                                    |
  |  SSE: chunk, chunk, tool_use       |
  |  <-------------------------------- |
  |                                    |
  |  [Client executes tool locally]    |
  |                                    |
  |  POST /code (with tool result)     |
  |  --------------------------------> |
  |                                    |
  |  SSE: chunk, chunk, done           |
  |  <-------------------------------- |

Each round-trip within the same session is billed separately. The session_id returned in the done event should be passed in subsequent requests to maintain conversation context.
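The pattern above can be sketched as a loop that keeps calling the endpoint until a response contains no tool calls. To keep the example free of network code, `send` is a hypothetical callable that takes the messages and the current session ID, posts them to /code, and returns the parsed events; `registry` maps tool names to local functions. The `max_rounds` cap is an added safeguard, not something the API requires.

```python
def run_agent_loop(send, registry, messages, max_rounds=8):
    """Drive the tool-use loop; return the final text and the session ID."""
    session_id = None
    messages = list(messages)  # do not mutate the caller's list
    for _ in range(max_rounds):
        text, calls = "", []
        for event in send(messages, session_id):
            kind = event["type"]
            if kind == "chunk":
                text += event["content"]
            elif kind == "tool_use":
                calls.append(event)
            elif kind == "done":
                # Reuse the returned session_id in later requests.
                session_id = event.get("session_id", session_id)
            elif kind == "error":
                raise RuntimeError(event["message"])
        if not calls:
            return text, session_id  # final answer, no further tool calls
        # Record the assistant turn, then one tool result per call.
        blocks = [{"type": "text", "text": text}] if text else []
        blocks += [{"type": "tool_use", "id": c["id"], "name": c["name"],
                    "input": c["input"]} for c in calls]
        messages.append({"role": "assistant", "content": blocks})
        for c in calls:
            result = registry[c["name"]](**c["input"])
            messages.append({"role": "tool", "tool_use_id": c["id"],
                             "content": result})
    raise RuntimeError("agent loop exceeded max_rounds")
```

Each pass through the loop is one billed round-trip, so the cap also bounds the cost of a runaway session.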

Credit Billing

Credit billing follows the same reservation-and-reconciliation model as /chat. Each round-trip (request/response pair) is billed independently based on the tokens consumed in that exchange.
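Since each round-trip is billed on its own, the cost of a whole session is the sum over its done events. The sketch below totals the usage fields shown in the done event example; `total_usage` is a hypothetical helper, and it only adds up what the server reports rather than modeling reservation or reconciliation.

```python
def total_usage(done_events):
    """Sum credits and token counts across the done events of a session."""
    totals = {"credits_used": 0, "input_tokens": 0, "output_tokens": 0}
    for event in done_events:
        for key in totals:
            # Treat a missing field as zero rather than failing.
            totals[key] += event.get(key, 0)
    return totals
```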

Error Responses

| Status | Description |
| --- | --- |
| 400 | Invalid request body, parameters, or tool definitions |
| 401 | Missing or invalid authentication |
| 403 | Insufficient credits or selected model does not support tool use |
| 404 | Specified model not found |
| 429 | Rate limit exceeded |
| 500 | Provider or internal server error |
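A client can map these status codes to messages and decide whether a retry is worthwhile. The sketch below is one possible policy: treating 429 and 500 as retryable is an assumption, not documented behavior, and both function names are hypothetical.

```python
ERROR_MEANINGS = {
    400: "Invalid request body, parameters, or tool definitions",
    401: "Missing or invalid authentication",
    403: "Insufficient credits or selected model does not support tool use",
    404: "Specified model not found",
    429: "Rate limit exceeded",
    500: "Provider or internal server error",
}

# Assumption: only rate limits and server-side failures are worth retrying.
RETRYABLE_STATUSES = {429, 500}

def describe_error(status):
    """Return the documented meaning of a status code, if there is one."""
    return ERROR_MEANINGS.get(status, f"Unexpected status {status}")

def should_retry(status):
    """Return True when a later retry might succeed under the policy above."""
    return status in RETRYABLE_STATUSES
```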
