Architecture
System Overview
                 +----------------------------------------------+
                 |                   Clients                    |
                 |  Web App (React 19)  |  VSCode Extension     |
                 |  API Key (CLI/SDK)   |  Cloud Tasks (cron)   |
                 +----------+------------------------+----------+
                            |                        |
                     Firebase Auth +            OIDC Bearer
                   App Check / API Key       (service account)
                            |                        |
              +-------------v-----------+   +--------v---------------------+
              |      inference-api      |   |        agent-executor        |
              |        Cloud Run        |   |          Cloud Run           |
              |  (inference-platform)   |   |      (inference-agents)      |
              |                         |   |                              |
              |  /chat, /code           |<--|  POST /internal/agent/code   |
              |  /models                |   |                              |
              |  /billing, /usage       |   |  /internal/agent/execute     |
              |  /api-keys              |   |  /internal/agent/ensemble    |
              |  /internal/agent/code   |   |  /internal/agent/swarm       |
              |                         |   |                              |
              |  Credits: reserve ->    |   |  Agentic loop, ensemble,     |
              |  stream -> reconcile    |   |  swarm, tool execution       |
              +--+----+----+----+-------+   +--+----------+----------------+
                 |    |    |    |              |          |
      +----------+    |    |    +------+    +--+          |
      v               v    v           v    v             v
+----------+ +----------+ +-----+ +----------+   +----------+
| Anthropic| | OpenAI   | |Redis| | Firestore|   |Cloud KMS |
| Claude   | | GPT-5.x  | |     | |  (both)  |   |(cred enc)|
+----------+ +----------+ +-----+ +----------+   +----------+
| Vertex AI| |OpenRouter|
| Gemini 3 | |DS/Grok/  |         +----------+   +----------+
|          | |Qwen/Kimi |         | BigQuery |   | Pub/Sub  |
+----------+ +----------+         |(analytics|   |(usage +  |
                                  | events)  |   | audit)   |
                                  +----------+   +----------+
Two Cloud Run Services
inference-api (Main API)
Responsibility | Detail
agent-executor (Agent Executor)
Responsibility | Detail
Four Providers
Provider | Models | Connection
Key Infrastructure
Service | Purpose | Project
Design Principles
Ship v1 First
Single Credit System
SSE Everywhere
Project-Level Isolation
Reservation-Based Billing
Data Flow
Chat Request (Web or API Key)
Scheduled Agent Execution
Credit Flow
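The overview diagram summarizes credit handling in inference-api as "reserve -> stream -> reconcile". One way to read that sequence is sketched below in TypeScript: hold a worst-case amount before the stream starts, meter the stream as it runs, then charge the actual cost and release the rest of the hold. Everything in the sketch is an assumption made for illustration: the `CreditLedger` class, its `deposit`, `reserve`, and `reconcile` methods, the per-character pricing in `streamAndMeter`, and the in-memory storage. None of these names come from the service itself, and the real system presumably persists balances rather than keeping them in process memory.

```typescript
// Hypothetical sketch of reserve -> stream -> reconcile. Not the service's code.
type Reservation = { id: number; userId: string; amount: number; settled: boolean };

class CreditLedger {
  // In-memory stand-ins for whatever store actually holds balances.
  private balances: Record<string, number> = {};
  private reservations: Record<number, Reservation> = {};
  private nextId = 1;

  deposit(userId: string, credits: number): void {
    this.balances[userId] = this.balance(userId) + credits;
  }

  balance(userId: string): number {
    return this.balances[userId] ?? 0;
  }

  // Step 1 (reserve): hold the worst-case cost so the stream cannot overdraw.
  reserve(userId: string, maxCost: number): Reservation {
    const available = this.balance(userId);
    if (maxCost > available) {
      throw new Error(`insufficient credits: need ${maxCost}, have ${available}`);
    }
    this.balances[userId] = available - maxCost;
    const reservation: Reservation = {
      id: this.nextId++,
      userId,
      amount: maxCost,
      settled: false,
    };
    this.reservations[reservation.id] = reservation;
    return reservation;
  }

  // Step 3 (reconcile): charge the actual cost and refund the unused hold.
  reconcile(reservationId: number, actualCost: number): number {
    const r = this.reservations[reservationId];
    if (!r || r.settled) {
      throw new Error(`unknown or already settled reservation ${reservationId}`);
    }
    const charged = Math.min(actualCost, r.amount); // never charge beyond the hold
    r.settled = true;
    this.balances[r.userId] = this.balance(r.userId) + (r.amount - charged);
    return charged;
  }
}

// Step 2 (stream) stand-in: consume chunks and meter them.
// Per-character pricing is an assumption chosen to keep the arithmetic visible.
function streamAndMeter(chunks: string[], costPerChar: number): number {
  let cost = 0;
  for (const chunk of chunks) {
    cost += chunk.length * costPerChar;
  }
  return cost;
}

const ledger = new CreditLedger();
ledger.deposit("alice", 100);
const hold = ledger.reserve("alice", 40);             // balance is 60 while streaming
const used = streamAndMeter(["Hel", "lo", "!"], 2);   // 6 characters * 2 = 12 credits
const charged = ledger.reconcile(hold.id, used);      // charge 12, refund 28
console.log(charged, ledger.balance("alice"));        // 12 88
```

Capping the charge at the held amount means a stream that runs longer than estimated cannot push a balance negative; under that design the reservation has to be sized from the request's maximum possible cost.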
Further Reading
