Usage

The usage endpoints provide credit balance information for individual users and platform-wide analytics for administrators. All usage data is sourced from BigQuery and cached in Redis for performance.


GET /usage/me

Returns the authenticated user's current plan, credit balances, and per-model usage breakdown for the current billing cycle. Responses are cached in Redis for 60 seconds to avoid repeated BigQuery queries.

Authentication

Requires one of:

  • X-Api-Key header with a valid API key

  • Authorization and X-Firebase-AppCheck headers for browser-based auth

Parameters

This endpoint takes no parameters.

Request Example

curl -X GET https://inference-api-611798501438.us-central1.run.app/usage/me \
  -H "X-Api-Key: inf_your_api_key_here"

Response

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| plan | string | Current subscription plan: free, guru, or pro. |
| plan_allocation | integer | Total monthly credits included in the plan (500, 10000, or 55000). |
| plan_credits_remaining | float | Remaining monthly plan credits for the current billing cycle. |
| topup_balance | float | Credits from one-time top-up purchases. Top-up credits do not expire at the end of the billing cycle. |
| this_month | object | Aggregate usage for the current billing cycle. |
| this_month.total_credits_used | float | Total credits consumed this billing cycle across all sources. |
| this_month.total_requests | integer | Total API requests this billing cycle. |
| by_model | array | Per-model usage breakdown with model, credits_used, and requests. |
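As a minimal sketch of how a client might consume these fields, the Python below maps a decoded response onto typed objects. The class names, the helper parse_usage, the available_credits property, and every value in the sample payload (including the model name) are hypothetical and exist only to illustrate the field layout described in the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelUsage:
    model: str
    credits_used: float
    requests: int


@dataclass
class UsageSummary:
    plan: str
    plan_allocation: int
    plan_credits_remaining: float
    topup_balance: float
    total_credits_used: float
    total_requests: int
    by_model: List[ModelUsage]

    @property
    def available_credits(self) -> float:
        # Hypothetical convenience: plan credits plus top-up credits.
        return self.plan_credits_remaining + self.topup_balance


def parse_usage(payload: dict) -> UsageSummary:
    """Map a decoded /usage/me JSON body onto the fields in the table."""
    month = payload["this_month"]
    return UsageSummary(
        plan=payload["plan"],
        plan_allocation=int(payload["plan_allocation"]),
        plan_credits_remaining=float(payload["plan_credits_remaining"]),
        topup_balance=float(payload["topup_balance"]),
        total_credits_used=float(month["total_credits_used"]),
        total_requests=int(month["total_requests"]),
        by_model=[
            ModelUsage(m["model"], float(m["credits_used"]), int(m["requests"]))
            for m in payload["by_model"]
        ],
    )


# Illustrative payload only; none of these values come from the real API.
sample = {
    "plan": "guru",
    "plan_allocation": 10000,
    "plan_credits_remaining": 9874.5,
    "topup_balance": 250.0,
    "this_month": {"total_credits_used": 125.5, "total_requests": 42},
    "by_model": [
        {"model": "example-model-a", "credits_used": 125.5, "requests": 42}
    ],
}

summary = parse_usage(sample)
```

Keeping the parsing in one function means a missing or renamed field fails loudly with a KeyError rather than silently producing a partial summary.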

Credit Deduction Order

When a request consumes credits, the system deducts in the following order:

  1. Monthly plan credits first -- the allowance included with the user's subscription (Free: 500, Guru: 10,000, Pro: 55,000). These reset at the start of each billing cycle.

  2. Top-up balance second -- one-time purchased credits are only consumed after monthly plan credits are exhausted. Top-up credits carry over across billing cycles and do not expire.

If neither balance can cover the estimated cost, the request is rejected with HTTP status 429 and a QUOTA_EXCEEDED error that includes the shortfall amount and links for upgrading the plan or purchasing a top-up.
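The deduction order can be sketched in Python as follows. This is not the platform's implementation: the names Balances, QuotaExceeded, and deduct are hypothetical, and the sketch assumes that a single request may draw on both balances when plan credits alone are insufficient, which the text above does not state explicitly.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Stand-in for the 429 QUOTA_EXCEEDED response."""

    def __init__(self, shortfall: float):
        super().__init__(f"QUOTA_EXCEEDED: short by {shortfall} credits")
        self.shortfall = shortfall


@dataclass
class Balances:
    plan_credits_remaining: float
    topup_balance: float


def deduct(balances: Balances, cost: float) -> Balances:
    """Return the balances after charging `cost`, plan credits first."""
    available = balances.plan_credits_remaining + balances.topup_balance
    if cost > available:
        # Assumption: the shortfall is measured against the combined balance.
        raise QuotaExceeded(shortfall=cost - available)

    # 1. Monthly plan credits are consumed first.
    from_plan = min(balances.plan_credits_remaining, cost)
    # 2. Whatever is left over comes out of the top-up balance.
    from_topup = cost - from_plan

    return Balances(
        plan_credits_remaining=balances.plan_credits_remaining - from_plan,
        topup_balance=balances.topup_balance - from_topup,
    )
```

For example, charging 120 credits against 100 plan credits and 50 top-up credits leaves 0 plan credits and 30 top-up credits under these assumptions.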

Caching Behavior

The response from /usage/me is cached in Redis under the key usage_cache:{uid} with a 60-second TTL. This means:

  • After a request consumes credits, the updated balance may take up to 60 seconds to appear in /usage/me.

  • To force an immediate refresh, call POST /billing/sync-balance instead.

  • The cache is automatically invalidated after credit reconciliation events.
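The caching behavior described above follows a cache-aside pattern with a fixed TTL. The sketch below models it with an in-memory dictionary instead of Redis so that it runs without external services. The class UsageCache, its methods, and the injectable clock are hypothetical; only the key format usage_cache:{uid} and the 60-second TTL come from the text.

```python
import time
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 60  # TTL stated for the /usage/me cache


class UsageCache:
    """In-memory stand-in for the Redis cache in front of BigQuery."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        # Maps cache key -> (expiry timestamp, cached value).
        self._store: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def key(uid: str) -> str:
        return f"usage_cache:{uid}"

    def get_or_load(self, uid: str, loader: Callable[[str], Any]) -> Any:
        """Return the cached value, or call `loader` and cache its result."""
        k = self.key(uid)
        now = self._clock()
        entry = self._store.get(k)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no backend query
        value = loader(uid)  # stale or missing: query the backend
        self._store[k] = (now + TTL_SECONDS, value)
        return value

    def invalidate(self, uid: str) -> None:
        """Drop the entry, as would happen after credit reconciliation."""
        self._store.pop(self.key(uid), None)
```

A caller that reads a balance immediately after spending credits should therefore expect a value up to 60 seconds old unless the entry has been invalidated.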

Error Responses

| Status | Description |
| --- | --- |
| 401 | Missing or invalid authentication |
| 500 | Internal server error |


Source Tracking

Every credit transaction is tagged with a source field that identifies where the request originated. This enables per-channel analytics and is visible in the admin usage dashboard.

| Source | Description | Set By |
| --- | --- | --- |
| web | Request from the browser-based web application (Firebase Auth + App Check). | Default for /chat requests |
| api_key | Request from the VSCode extension, CLI, or any client using an X-Api-Key header. | /code endpoint when auth_method == "api_key" |
| agent | Request from the agent executor service (Cloud Tasks / scheduled agent runs). | /internal/agent/code endpoint |

Each source can also carry a source_id for more granular attribution:

  • agent source includes the task_id of the agent execution

  • api_key source includes the API key hash for per-key usage tracking
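The tagging rules above can be sketched as a small Python function. The function name resolve_source, its parameters, and the auth_method value "firebase" are hypothetical. The fallback to web for any combination not listed in the table is an assumption, since the source only states that web is the default for /chat requests.

```python
from typing import Optional, Tuple


def resolve_source(
    endpoint: str,
    auth_method: str,
    *,
    task_id: Optional[str] = None,
    api_key_hash: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (source, source_id) for a credit transaction."""
    if endpoint == "/internal/agent/code":
        # Agent runs carry the task_id of the agent execution.
        return ("agent", task_id)
    if endpoint == "/code" and auth_method == "api_key":
        # API-key clients carry the key hash for per-key tracking.
        return ("api_key", api_key_hash)
    # Assumption: everything else is attributed to the web application.
    return ("web", None)
```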


Admin Endpoints

The following endpoints require the authenticated user to have the admin role set in their Firestore user document. Admin endpoints are accessible only via Firebase Auth (not API key).

GET /admin/usage

Returns platform-wide analytics including total requests, total credits consumed, active user counts, daily trends, and top models by usage.

Authentication

Requires Firebase Auth with admin role (Authorization + X-Firebase-AppCheck headers).

Request Example
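A minimal Python sketch of an admin request, using only the standard library. It assumes the Authorization header carries a Firebase ID token in Bearer form, which this page does not confirm. The helper build_admin_request and both placeholder token strings are hypothetical. The request is built but not sent.

```python
import urllib.request

BASE_URL = "https://inference-api-611798501438.us-central1.run.app"


def build_admin_request(
    path: str, id_token: str, app_check_token: str
) -> urllib.request.Request:
    """Build a GET request carrying the two headers admin endpoints require."""
    return urllib.request.Request(
        BASE_URL + path,
        method="GET",
        headers={
            # Assumption: Firebase ID token sent as a Bearer token.
            "Authorization": f"Bearer {id_token}",
            "X-Firebase-AppCheck": app_check_token,
        },
    )


req = build_admin_request(
    "/admin/usage", "<firebase-id-token>", "<app-check-token>"
)
# urllib.request.urlopen(req) would send it. A 403 response means the
# authenticated user lacks the admin role.
```

The same helper applies to GET /admin/model-economics by changing the path argument.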

Response

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| period | string | The billing period in YYYY-MM format. |
| total_requests | integer | Total API requests across all users in the period. |
| total_credits_consumed | integer | Total credits consumed across all users in the period. |
| active_users | integer | Number of unique users who made at least one request. |
| requests_by_day | array | Daily breakdown with date, requests, and credits. |
| top_models | array | Models ranked by request count with model, requests, and credits. |

Error Responses

| Status | Description |
| --- | --- |
| 401 | Missing or invalid authentication |
| 403 | User does not have admin role |
| 500 | Internal server error |


GET /admin/model-economics

Returns per-model cost versus revenue breakdown, showing the platform's margin on each model. Data is sourced from BigQuery usage events and provider cost tables.

Authentication

Requires Firebase Auth with admin role (Authorization + X-Firebase-AppCheck headers).

Request Example

Response

Model Economics Object

| Field | Type | Description |
| --- | --- | --- |
| model | string | Model identifier. |
| provider | string | Upstream provider name (anthropic, openai, vertex, openrouter). |
| total_requests | integer | Total requests to this model in the period. |
| total_input_tokens | integer | Total input tokens processed. |
| total_output_tokens | integer | Total output tokens generated. |
| provider_cost_usd | float | Total cost paid to the upstream provider in USD. |
| revenue_credits | integer | Total credits charged to users. |
| revenue_usd | float | Total revenue in USD (credits / 1000). |
| margin_pct | float | Profit margin percentage: (revenue - cost) / revenue * 100. |
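The two derived fields follow directly from the formulas in the table. A minimal Python sketch follows. The function names mirror the field names but are otherwise hypothetical, and returning 0.0 when revenue is zero is an assumption, since the source does not say how that case is reported.

```python
def revenue_usd(revenue_credits: int) -> float:
    """Convert credits to USD: 1000 credits correspond to 1 USD."""
    return revenue_credits / 1000


def margin_pct(revenue: float, cost: float) -> float:
    """Profit margin percentage: (revenue - cost) / revenue * 100."""
    if revenue == 0:
        # Assumption: avoid division by zero for models with no revenue.
        return 0.0
    return (revenue - cost) / revenue * 100


# Made-up figures: 14,000 credits charged against 10 USD of provider cost.
example_revenue = revenue_usd(14000)
example_margin = margin_pct(example_revenue, 10.0)
```

With these made-up figures the revenue is 14 USD and the margin is roughly 28.57 percent.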

How Margins Work

The platform applies a 1.4x markup to raw provider costs for all text-based models, with an additional 1.15x multiplier for image generation models. The margin_pct field in the response reflects the realized margin after accounting for actual token usage patterns.

Credits are calculated from the request's input and output token counts and from input_price and output_price, the provider's per-token costs in USD. For image models, a flat per-image credit cost is applied instead of per-token pricing.
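One reading of the per-token calculation, consistent with the 1.4x figure and the credits / 1000 conversion stated in this section, can be sketched in Python as follows. This is an assumption rather than the platform's confirmed formula. The function estimate_credits and both constants are hypothetical names, any rounding the platform applies is unknown, and the image-model path is not covered.

```python
TEXT_MULTIPLIER = 1.4   # multiplier on raw provider cost for text models
CREDITS_PER_USD = 1000  # from revenue_usd = credits / 1000


def estimate_credits(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
) -> float:
    """Estimate credits for a text request from per-token USD prices."""
    provider_cost_usd = (
        input_tokens * input_price + output_tokens * output_price
    )
    return provider_cost_usd * TEXT_MULTIPLIER * CREDITS_PER_USD


# Made-up prices: 0.000003 USD per input token, 0.000015 USD per output token.
credits = estimate_credits(1000, 500, 0.000003, 0.000015)
```

Under these assumptions the example request costs about 0.0105 USD at the provider and is charged roughly 14.7 credits.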

Error Responses

| Status | Description |
| --- | --- |
| 401 | Missing or invalid authentication |
| 403 | User does not have admin role |
| 500 | Internal server error |
