Usage
The usage endpoints provide credit balance information for individual users and platform-wide analytics for administrators. All usage data is sourced from BigQuery and cached in Redis for performance.
GET /usage/me
Returns the authenticated user's current plan, credit balances, and per-model usage breakdown for the current billing cycle. Responses are cached in Redis for 60 seconds to avoid repeated BigQuery queries.
Authentication
Requires one of:
X-Api-Key header with a valid API key
Authorization and X-Firebase-AppCheck headers for browser-based auth
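For clients that are not using curl, the API-key option can be sketched in Python with the standard library. The base URL and the X-Api-Key header come from this page; the helper names build_usage_request and fetch_usage are hypothetical, and the sketch assumes the response body is JSON.

```python
import json
import urllib.request

# Base URL as it appears in the curl example on this page.
BASE_URL = "https://inference-api-611798501438.us-central1.run.app"


def build_usage_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET /usage/me request authenticated by API key."""
    return urllib.request.Request(
        f"{BASE_URL}/usage/me",
        headers={"X-Api-Key": api_key},
        method="GET",
    )


def fetch_usage(api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON.

    urllib raises urllib.error.HTTPError for non-2xx statuses such as 401 or 500.
    """
    with urllib.request.urlopen(build_usage_request(api_key), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping the request construction separate from the network call makes the header logic easy to check without a live key.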
Parameters
This endpoint takes no parameters.
Request Example
curl -X GET https://inference-api-611798501438.us-central1.run.app/usage/me \
  -H "X-Api-Key: inf_your_api_key_here"
Response
Response Fields
plan (string) -- Current subscription plan: free, guru, or pro.
plan_allocation (integer) -- Total monthly credits included in the plan (500, 10000, or 55000).
plan_credits_remaining (float) -- Remaining monthly plan credits for the current billing cycle.
topup_balance (float) -- Credits from one-time top-up purchases. Top-up credits do not expire at the end of the billing cycle.
this_month (object) -- Aggregate usage for the current billing cycle.
this_month.total_credits_used (float) -- Total credits consumed this billing cycle across all sources.
this_month.total_requests (integer) -- Total API requests this billing cycle.
by_model (array) -- Per-model usage breakdown with model, credits_used, and requests.
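The fields above can be captured as typed structures on the client side. This is a minimal sketch: the class names and the spendable_credits helper are hypothetical, and placing by_model at the top level (rather than inside this_month) is an assumption based on how the field names are written here.

```python
from typing import List, TypedDict


class ModelUsage(TypedDict):
    model: str
    credits_used: float
    requests: int


class ThisMonth(TypedDict):
    total_credits_used: float
    total_requests: int


class UsageResponse(TypedDict):
    plan: str                      # "free", "guru", or "pro"
    plan_allocation: int           # 500, 10000, or 55000
    plan_credits_remaining: float
    topup_balance: float
    this_month: ThisMonth
    by_model: List[ModelUsage]     # assumed to be a top-level field


def spendable_credits(usage: UsageResponse) -> float:
    """Credits the user can still spend: remaining plan credits plus top-up balance."""
    return usage["plan_credits_remaining"] + usage["topup_balance"]
```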
Credit Deduction Order
When a request consumes credits, the system deducts in the following order:
Monthly plan credits first -- the allowance included with the user's subscription (Free: 500, Guru: 10,000, Pro: 55,000). These reset at the start of each billing cycle.
Top-up balance second -- one-time purchased credits are only consumed after monthly plan credits are exhausted. Top-up credits carry over across billing cycles and do not expire.
If neither balance can cover the estimated cost, the request is rejected with a 429 status and a QUOTA_EXCEEDED error that includes the shortfall amount and links to upgrade or top-up.
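The deduction order can be illustrated with a small Python sketch. It is not the platform's implementation: the Balances and QuotaExceeded names are hypothetical, and it assumes a single request may be split across both balances when plan credits alone are insufficient.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Mirrors the documented 429 QUOTA_EXCEEDED rejection (hypothetical class)."""

    status = 429
    code = "QUOTA_EXCEEDED"

    def __init__(self, shortfall: float):
        super().__init__(f"{self.code}: short by {shortfall} credits")
        self.shortfall = shortfall


@dataclass(frozen=True)
class Balances:
    plan_credits_remaining: float
    topup_balance: float


def deduct(balances: Balances, cost: float) -> Balances:
    """Charge monthly plan credits first, then the top-up balance."""
    available = balances.plan_credits_remaining + balances.topup_balance
    if cost > available:
        # Neither balance (nor, by assumption, both together) covers the cost.
        raise QuotaExceeded(shortfall=cost - available)
    from_plan = min(cost, balances.plan_credits_remaining)
    from_topup = cost - from_plan
    return Balances(
        plan_credits_remaining=balances.plan_credits_remaining - from_plan,
        topup_balance=balances.topup_balance - from_topup,
    )
```

For example, a cost of 120 against 100 plan credits and 50 top-up credits leaves 0 plan credits and 30 top-up credits.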
Caching Behavior
The response from /usage/me is cached in Redis under the key usage_cache:{uid} with a 60-second TTL. This means:
After a request consumes credits, the updated balance may take up to 60 seconds to appear in /usage/me.
To force an immediate refresh, call POST /billing/sync-balance instead.
The cache is automatically invalidated after credit reconciliation events.
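The caching behavior described above follows a cache-aside pattern with a fixed TTL. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs without any service; the UsageCache class, its loader callback, and the injectable clock are hypothetical, while the key format and 60-second TTL come from this page.

```python
import time
from typing import Callable, Dict, Tuple

TTL_SECONDS = 60  # TTL stated for /usage/me responses


class UsageCache:
    """In-memory stand-in for the Redis cache (illustrative only)."""

    def __init__(self, load: Callable[[str], dict],
                 clock: Callable[[], float] = time.monotonic):
        self._load = load      # e.g. the expensive BigQuery-backed lookup
        self._clock = clock    # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    @staticmethod
    def key(uid: str) -> str:
        return f"usage_cache:{uid}"

    def get(self, uid: str) -> dict:
        now = self._clock()
        hit = self._entries.get(self.key(uid))
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: serve the cached response
        value = self._load(uid)
        self._entries[self.key(uid)] = (now + TTL_SECONDS, value)
        return value

    def invalidate(self, uid: str) -> None:
        """Drop the entry, as happens after credit reconciliation events."""
        self._entries.pop(self.key(uid), None)
```

With this pattern a balance change is invisible to readers until the entry expires or is invalidated, which is why a stale value can persist for up to 60 seconds.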
Error Responses
401 -- Missing or invalid authentication
500 -- Internal server error
Source Tracking
Every credit transaction is tagged with a source field that identifies where the request originated. This enables per-channel analytics and is visible in the admin usage dashboard.
web -- Request from the browser-based web application (Firebase Auth + App Check). Default for /chat requests.
api_key -- Request from the VSCode extension, CLI, or any client using an X-Api-Key header. /code endpoint when auth_method == "api_key".
agent -- Request from the agent executor service (Cloud Tasks / scheduled agent runs). /internal/agent/code endpoint.
Each source can also carry a source_id for more granular attribution:
agent source includes the task_id of the agent execution
api_key source includes the API key hash for per-key usage tracking
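The tagging rules above can be sketched as a single function. This is an illustration under assumptions, not the service's code: resolve_source and the Source tuple are hypothetical names, any request authenticated by API key is treated as api_key, and SHA-256 is assumed for the key hash because the hashing scheme is not stated here.

```python
import hashlib
from typing import NamedTuple, Optional


class Source(NamedTuple):
    source: str
    source_id: Optional[str]


def resolve_source(path: str, auth_method: str,
                   api_key: Optional[str] = None,
                   task_id: Optional[str] = None) -> Source:
    """Pick the source tag and optional source_id for a credit transaction."""
    if path == "/internal/agent/code":
        # Agent executor runs carry the task_id of the execution.
        return Source("agent", task_id)
    if auth_method == "api_key":
        # Assumption: SHA-256 hex digest stands in for the real key hash.
        key_hash = hashlib.sha256(api_key.encode()).hexdigest() if api_key else None
        return Source("api_key", key_hash)
    # Browser-based requests (Firebase Auth + App Check) default to web.
    return Source("web", None)
```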
Admin Endpoints
The following endpoints require the authenticated user to have the admin role set in their Firestore user document. Admin endpoints are accessible only via Firebase Auth (not API key).
GET /admin/usage
Returns platform-wide analytics including total requests, total credits consumed, active user counts, daily trends, and top models by usage.
Authentication
Requires Firebase Auth with admin role (Authorization + X-Firebase-AppCheck headers).
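A request carrying both headers might be built as follows. Only the header names and the /admin/usage path come from this page; the Bearer scheme for the Authorization header, the base URL reuse, and the build_admin_usage_request helper are assumptions made for illustration.

```python
import urllib.request

# Assumption: admin endpoints share the base URL shown for /usage/me.
BASE_URL = "https://inference-api-611798501438.us-central1.run.app"


def build_admin_usage_request(id_token: str, app_check_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /admin/usage request using Firebase credentials."""
    return urllib.request.Request(
        f"{BASE_URL}/admin/usage",
        headers={
            # Assumption: the Firebase ID token is sent as a Bearer token.
            "Authorization": f"Bearer {id_token}",
            "X-Firebase-AppCheck": app_check_token,
        },
        method="GET",
    )
```

A 403 response to such a request would indicate valid authentication without the admin role.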
Request Example
Response
Response Fields
period (string) -- The billing period in YYYY-MM format.
total_requests (integer) -- Total API requests across all users in the period.
total_credits_consumed (integer) -- Total credits consumed across all users in the period.
active_users (integer) -- Number of unique users who made at least one request.
requests_by_day (array) -- Daily breakdown with date, requests, and credits.
top_models (array) -- Models ranked by request count with model, requests, and credits.
Error Responses
401 -- Missing or invalid authentication
403 -- User does not have admin role
500 -- Internal server error
GET /admin/model-economics
Returns per-model cost versus revenue breakdown, showing the platform's margin on each model. Data is sourced from BigQuery usage events and provider cost tables.
Authentication
Requires Firebase Auth with admin role (Authorization + X-Firebase-AppCheck headers).
Request Example
Response
Model Economics Object
model (string) -- Model identifier.
provider (string) -- Upstream provider name (anthropic, openai, vertex, openrouter).
total_requests (integer) -- Total requests to this model in the period.
total_input_tokens (integer) -- Total input tokens processed.
total_output_tokens (integer) -- Total output tokens generated.
provider_cost_usd (float) -- Total cost paid to the upstream provider in USD.
revenue_credits (integer) -- Total credits charged to users.
revenue_usd (float) -- Total revenue in USD (credits / 1000).
margin_pct (float) -- Profit margin percentage: (revenue - cost) / revenue * 100.
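The two derived fields can be recomputed from revenue_credits and provider_cost_usd using the formulas given in their descriptions. The function names are hypothetical, and returning 0.0 when there is no revenue is an assumption, since the behavior for that case is not described.

```python
def revenue_usd(revenue_credits: int) -> float:
    """Convert credits to USD at the documented rate of 1000 credits per dollar."""
    return revenue_credits / 1000


def margin_pct(revenue_credits: int, provider_cost_usd: float) -> float:
    """Profit margin percentage: (revenue - cost) / revenue * 100."""
    revenue = revenue_usd(revenue_credits)
    if revenue == 0:
        return 0.0  # assumption: avoid dividing by zero when nothing was charged
    return (revenue - provider_cost_usd) / revenue * 100
```

For instance, 2000 credits of revenue against a provider cost of 1.00 USD gives 2.00 USD of revenue and a margin of 50%.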
How Margins Work
The platform multiplies raw provider costs by 1.4 for all text-based models, with an additional 1.15x multiplier for image generation models. A 1.4x multiplier on its own corresponds to a margin_pct of about 28.6%, since (1.4 - 1) / 1.4 ≈ 0.286. The margin_pct field in the response reflects the realized margin after accounting for actual token usage patterns.
Credits are calculated as:
Where input_price and output_price are the provider's per-token costs in USD. For image models, a flat per-image credit cost is applied instead of per-token pricing.
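As a rough illustration only, one calculation consistent with the 1.4x multiplier and the credits / 1000 conversion is: provider cost in USD, times 1.4, times 1000. This is an assumption, not the platform's confirmed formula; estimate_text_credits and its constants are hypothetical names, rounding is not modeled, and image models are excluded because they use a flat per-image cost.

```python
TEXT_MULTIPLIER = 1.4    # multiplier stated for text-based models
CREDITS_PER_USD = 1000   # implied by revenue_usd = credits / 1000


def estimate_text_credits(input_tokens: int, output_tokens: int,
                          input_price: float, output_price: float) -> float:
    """Estimate credits for a text request (assumed formula, see lead-in).

    input_price and output_price are the provider's per-token costs in USD.
    """
    provider_cost_usd = input_tokens * input_price + output_tokens * output_price
    return provider_cost_usd * TEXT_MULTIPLIER * CREDITS_PER_USD
```

Under this reading, a request costing the provider 0.004 USD would be charged about 5.6 credits.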
Error Responses
401 -- Missing or invalid authentication
403 -- User does not have admin role
500 -- Internal server error