Security

The Majestix AI Inference Hub implements defense-in-depth security across both GCP projects, covering API authentication, agent execution isolation, credential management, and data protection.

Authentication Layers

The platform supports three authentication methods, selected automatically based on the request context.

Method 1: API Key

Clients: VSCode extension, CLI tools, SDK integrations, any programmatic access.

X-Api-Key: inf_<url_safe_base64>

API key authentication is attempted first when the X-Api-Key header is present.

Property

Detail

Format

inf_ prefix + URL-safe base64 random bytes

Storage

SHA-256 hash only. Raw key never stored.

Verification

Hash key, check Redis cache (15-min TTL), fallback to Firestore on cache miss.

Revocation

Delete from Firestore + invalidate Redis cache entry immediately.

Expiry

90 days by default (configurable).

Rate limiting

Per-user, not per-key.

Security properties:

Keys are SHA-256 hashed before storage -- a database breach does not expose usable keys.
Failed lookups (key not found) are not cached to prevent cache poisoning, where an attacker could force a "not found" entry into cache for a valid key.
The plaintext key is shown to the user exactly once at creation time and is never retrievable again.
Revoking a key both deletes the Firestore document and evicts the Redis cache entry, ensuring immediate effect.

Method 2: Firebase Auth + App Check (Web)

Clients: Browser-based web applications only.

Authorization: Bearer <firebase_id_token>
X-Firebase-AppCheck: <app_check_token>

Both headers are required. The ID token authenticates the user; the App Check token verifies the request originates from a legitimate web app instance.

Property

Detail

ID Token

Verified via Firebase Admin SDK. Contains uid, email, and custom claims.

App Check

reCAPTCHA Enterprise provider. Validates the client is a real browser, not a script or bot.

Verification

App Check token verified first, then ID token. Both must pass.

Method 3: OIDC (Service-to-Service)

Clients: Agent executor, Cloud Tasks, internal services.

Authorization: Bearer <google_oidc_token>

OIDC authentication is used exclusively for service-to-service communication. It is not available to end users.

Property

Detail

Token

Google-signed OIDC token with audience matching the target service URL.

Verification

Token signature validated against Google's public key set.

Service account allowlist

Only authorized SAs accepted (configured in INTERNAL_ALLOWED_SAS).

User context

user_id passed in the request body to charge credits to the correct user.

Rate limiting

OIDC-authenticated services are exempt from IP rate limits.

The verify_user() function in app/auth/firebase_user.py implements the dual-path logic for API key and Firebase auth. OIDC verification is handled separately by verify_internal_caller().

Rate Limiting

Per-IP Rate Limiting

All non-OIDC requests are subject to per-IP rate limiting. The limit is configurable via RATE_LIMIT_PER_MINUTE (default: 30 requests/minute). Rate limit state is stored in Redis for sub-millisecond enforcement.

Per-User Concurrent Request Limiting

Each user has a maximum number of concurrent in-flight requests. This prevents a single user from monopolizing server capacity.

IP Lockout

Repeated authentication failures trigger an IP-level lockout:

Threshold

Lockout

10 failed auth attempts in 5 minutes

15-minute IP block

The lockout applies to all authentication methods. After the lockout period, the IP is automatically unblocked. Lockout state is stored in Redis.

This mechanism protects against:

Brute-force API key guessing
Credential stuffing attacks against Firebase Auth
Automated scanning of the API surface

App Check (reCAPTCHA Enterprise)

Web clients are required to include a Firebase App Check token on every request. App Check uses reCAPTCHA Enterprise as its attestation provider.

Request from browser:
  1. reCAPTCHA Enterprise issues a site-specific token
  2. Firebase App Check wraps it in an App Check token
  3. Backend verifies: getAppCheck().verifyToken(token)
  4. Invalid token -> 403 Forbidden

App Check prevents:

Automated scripts calling the API with stolen Firebase credentials
API abuse from unauthorized web applications
Token replay from non-browser environments

CORS configuration reinforces this by allowing requests only from the two Firebase Hosting origins (inference-web.web.app and inference-web2.web.app).

Agent Security (8-Layer Defense-in-Depth)

The agent executor implements an 8-layer defense model to prevent prompt injection, credential exfiltration, infrastructure abuse, and runaway costs. All security logic resides in security.py in the agent executor.

For detailed agent-specific security documentation, see Agent Security.

+-----------------------------------------------------------------------+
|  Layer 0:  GCP Project Isolation                                      |
|  Agent executor runs in inference-agents, separate from inference-     |
|  platform. Separate IAM, service accounts, audit logs.                |
+-----------------------------------------------------------------------+
|  Layer 0b: Prompt Security Scanner                                    |
|  Blocks prompts that attempt infrastructure provisioning (GCP/AWS/    |
|  Azure), cryptocurrency mining, reverse shells, data exfiltration.    |
+-----------------------------------------------------------------------+
|  Layer 1:  URL Whitelist (Default-Deny)                               |
|  Outbound HTTP only allowed to domains registered in platform_        |
|  integrations. Every URL validated before connection.                  |
+-----------------------------------------------------------------------+
|  Layer 2:  Private IP Blocking                                        |
|  Blocks RFC 1918, link-local (169.254.x.x), loopback (127.x.x.x),   |
|  IPv6 ULA (fc00::/7), IPv6 link-local (fe80::/10). Prevents SSRF.   |
+-----------------------------------------------------------------------+
|  Layer 3:  Credential Exfiltration Prevention                         |
|  Scans all outbound request bodies and headers for plaintext,         |
|  base64-encoded, and URL-encoded credential material.                 |
+-----------------------------------------------------------------------+
|  Layer 4:  Rate Limiting                                              |
|  Per-execution + per-minute API call caps. Response size limits.      |
|  Plan-based thresholds (Guru vs Pro).                                 |
+-----------------------------------------------------------------------+
|  Layer 5:  Response Sanitization                                      |
|  External API responses wrapped in <tool_response> tags with          |
|  safety preamble. Strips internal details.                            |
+-----------------------------------------------------------------------+
|  Layer 6:  Content Moderation                                         |
|  Regex-based pattern detection on task creation and updates.          |
|  Blocks infrastructure abuse, mining, and exfiltration patterns.      |
+-----------------------------------------------------------------------+
|  Layer 7:  Admin Kill Switch                                          |
|  Deprecating a platform_integration immediately revokes all agent     |
|  access to that service's domains. Hourly registry refresh.           |
+-----------------------------------------------------------------------+

Layer 0 -- GCP Project Isolation

The agent executor runs in inference-agents, a completely separate GCP project from inference-platform. This provides:

Separate IAM policies and service accounts
Independent audit logs (Cloud Audit Logs)
No access to Redis Memorystore, billing data, or provider API keys
Blast radius containment if an agent is tricked into malicious behavior

Layer 0b -- Prompt Security Scanner

Before any LLM call, the executor scans task prompts for patterns indicating abuse:

Cloud infrastructure provisioning (terraform, gcloud, aws, az)
Cryptocurrency mining instructions
Reverse shell commands
Data exfiltration instructions
Attempts to override safety preambles or system instructions
References to cloud metadata endpoints (169.254.169.254)

Flagged prompts are rejected with status blocked and reason content_moderation. No LLM call is made.

Layer 1 -- URL Whitelist (Default-Deny)

All outbound HTTP requests from agent tools must match an approved domain pattern in platform_integrations. The whitelist uses scheme + host + path prefix matching.

URLs not matching any approved pattern are rejected
The whitelist is loaded from Firestore and refreshed hourly
Agents cannot modify the whitelist via prompt injection

Layer 2 -- Private IP Blocking

After DNS resolution, the resolved IP address is checked against blocked ranges:

Range

Description

10.0.0.0/8

RFC 1918 private (Class A)

172.16.0.0/12

RFC 1918 private (Class B)

192.168.0.0/16

RFC 1918 private (Class C)

169.254.0.0/16

Link-local (includes cloud metadata at 169.254.169.254)

127.0.0.0/8

Loopback

fc00::/7

IPv6 Unique Local Address

fe80::/10

IPv6 Link-Local

This prevents SSRF attacks where an LLM is tricked into making requests to internal infrastructure, cloud metadata endpoints, or other services on the private network.

Layer 3 -- Credential Exfiltration Prevention

Every outbound HTTP request is scanned for credential material in three encodings:

Plaintext -- raw credential strings in headers or body
Base64-encoded -- base64 encoding of credential strings
URL-encoded -- percent-encoded credential strings

If any credential material is detected in outbound traffic, the request is blocked. This prevents prompt injection attacks that attempt to exfiltrate decrypted credentials by encoding them in API call bodies.

Layer 4 -- Rate Limiting

Per-plan enforcement during agent execution:

Limit

Guru

Pro

Max iterations per execution

Max credits per execution

100

500

Execution timeout

10 min

30 min

API calls per execution

API calls per minute

Max response size

500 KB

2 MB

Layer 5 -- Response Sanitization

External API responses are wrapped in <tool_response> XML tags with a safety preamble that instructs the LLM to treat the content as untrusted data. Internal details (service account names, internal IPs, KMS key paths) are stripped from all responses.

Layer 6 -- Content Moderation

Regex-based pattern detection runs on task creation and update. This catches abuse before execution:

Infrastructure provisioning commands
Mining instructions
Credential theft patterns
Safety override attempts

Layer 7 -- Admin Kill Switch

Setting a platform_integration status to deprecated immediately revokes all agent access to that service's domains. The integration registry is refreshed hourly from Firestore and on executor startup.

Credential Encryption (Cloud KMS)

External service credentials (third-party API keys, OAuth tokens) are encrypted at rest using Cloud KMS with a split-trust service account model.

User stores API key via Cloud Function
    |
    v  KMS encrypt (Bridge SA: encrypt-only)
Firestore: agent_tasks/{id}/credentials/{key}
    |  (encrypted blob at rest)
    |
    v  KMS decrypt (Executor SA: decrypt-only)
Decrypted in memory during execution
    |
    v  Injected into HTTP Authorization header only
External API call
    |
    v  Zeroed from memory after request completes

Split Trust

Role

Service Account

Permission

Project

Bridge (encrypt)

[email protected]

cloudkms.cryptoKeyVersions.useToEncrypt

inference-platform

Executor (decrypt)

[email protected]

cloudkms.cryptoKeyVersions.useToDecrypt

inference-agents

Bridge SA can encrypt but cannot decrypt
Executor SA can decrypt but cannot encrypt
Credentials are never stored in plaintext in Firestore
Decrypted credentials are held in memory only for the duration of the HTTP request
Credential leak scanning (Layer 3) prevents exfiltration even if the LLM is compromised

DNS Pinning

All outbound HTTP requests from agent tools use DNS pinning to prevent TOCTOU (Time-of-Check-Time-of-Use) attacks:

1. URL passes whitelist validation
2. DNS resolution: hostname -> IP address
3. IP validation: check against private IP block list (Layer 2)
4. URL rewrite: replace hostname with resolved IP
5. Set Host header and SNI to original hostname
6. Make HTTPS connection to resolved IP

This prevents DNS rebinding attacks where:

A hostname resolves to a whitelisted public IP at validation time
DNS is changed to point to 169.254.169.254 (cloud metadata) or 127.0.0.1 (loopback) before the connection is established

By pinning the IP at resolution time and rewriting the URL, the connection always goes to the exact IP that was validated.

Session Encryption (Fernet)

Conversation sessions stored in Redis are encrypted using Fernet symmetric encryption (from the cryptography library). Fernet provides authenticated encryption: data is both encrypted (AES-128-CBC) and authenticated (HMAC-SHA256).

Property

Detail

Algorithm

AES-128-CBC + HMAC-SHA256 (Fernet)

Key management

REDIS_ENCRYPTION_KEY environment variable

Scope

Session metadata and message history

Key format

session:{id}:meta and session:{id}:messages

TTL

Configurable (default 1 hour)

Data flow:

User sends message
    |
    v
Serialize message -> Fernet.encrypt(plaintext) -> Store in Redis
    |
    v
Load session: Redis -> Fernet.decrypt(ciphertext) -> Deserialize

Encryption ensures that even if Redis is compromised (e.g., via a vulnerability in the VPC connector), conversation content remains protected.

Security Headers

All API responses include hardened security headers:

Header

Value

Purpose

Strict-Transport-Security

max-age=31536000; includeSubDomains

Enforce HTTPS

Content-Security-Policy

Restrictive policy

Prevent XSS and data injection

X-Frame-Options

DENY

Prevent clickjacking

X-Content-Type-Options

nosniff

Prevent MIME sniffing

X-XSS-Protection

1; mode=block

Legacy XSS filter

CORS

CORS is configured to allow requests only from the two Firebase Hosting origins:

https://inference-web.web.app
https://inference-web.firebaseapp.com
https://inference-web2.web.app
https://inference-web2.firebaseapp.com

Preflight requests are cached. API key-authenticated requests bypass CORS since they originate from non-browser clients.

Error Sanitization

All error responses across both services are sanitized to prevent information leakage:

Sanitized Element

Replacement

Service account emails

[REDACTED_SA]

Stack traces

Generic error code + correlation ID

Firestore document paths

Opaque error reference

KMS key names and versions

[REDACTED_KEY]

Internal IP addresses

[REDACTED_IP]

Environment variable names

[REDACTED_ENV]

Provider API keys in errors

[REDACTED_CREDENTIAL]

Clients receive a correlation ID with every error response. Support teams use this ID to look up the full, unsanitized error in Cloud Logging.

Data Protection Summary

Data Store

Encryption at Rest

Encryption in Transit

Access Control

Firestore

Google-managed

TLS

IAM per project

Redis Memorystore

Fernet (application-level)

In-transit encryption

VPC connector (private)

Cloud KMS

Customer-managed keys

TLS

Split SA (encrypt/decrypt)

BigQuery

Google-managed

TLS

IAM + service account

Artifact Registry

Google-managed

TLS

IAM per project

Compliance Summary

Control

Implementation

Authentication

3 methods: API key (SHA-256), Firebase App Check (reCAPTCHA Enterprise), OIDC

Authorization

Firestore role-based (user, admin)

Transport encryption

HTTPS enforced (HSTS)

Data at rest encryption

Google-managed (Firestore, BigQuery) + Fernet (Redis) + KMS (credentials)

Credential storage

SHA-256 hashed API keys, KMS-encrypted service credentials

Rate limiting

Per-IP with progressive lockout (10 failures / 5 min = 15 min block)

Input validation

Prompt scanning, URL whitelist, DNS pinning, private IP blocking

Output sanitization

Credential stripping, stack trace removal, IP redaction

Audit logging

Cloud Logging (structured JSON) + BigQuery (analytics)

Project isolation

Separate GCP projects for API and executor

Supply chain

--require-hashes for all dependencies

Architecture Overview -- system design and service topology
Infrastructure -- GCP services and deployment pipeline
Authentication -- client-facing auth documentation
Agent Security -- detailed agent security documentation
Orchestration -- orchestration-specific security context

PreviousInfrastructure Next❱ VSCode Extension

Last updated 1 hour ago

Good morning

hashtagAuthentication Layers

hashtagMethod 1: API Key

hashtagMethod 2: Firebase Auth + App Check (Web)

hashtagMethod 3: OIDC (Service-to-Service)

hashtagRate Limiting

hashtagPer-IP Rate Limiting

hashtagPer-User Concurrent Request Limiting

hashtagIP Lockout

hashtagApp Check (reCAPTCHA Enterprise)

hashtagAgent Security (8-Layer Defense-in-Depth)

hashtagLayer 0 -- GCP Project Isolation

hashtagLayer 0b -- Prompt Security Scanner

hashtagLayer 1 -- URL Whitelist (Default-Deny)

hashtagLayer 2 -- Private IP Blocking

hashtagLayer 3 -- Credential Exfiltration Prevention

hashtagLayer 4 -- Rate Limiting

hashtagLayer 5 -- Response Sanitization

hashtagLayer 6 -- Content Moderation

hashtagLayer 7 -- Admin Kill Switch

hashtagCredential Encryption (Cloud KMS)

hashtagSplit Trust

hashtagDNS Pinning

hashtagSession Encryption (Fernet)

hashtagSecurity Headers

hashtagCORS

hashtagError Sanitization

hashtagData Protection Summary

hashtagCompliance Summary

hashtagRelated

Authentication Layers

Method 1: API Key

Method 2: Firebase Auth + App Check (Web)

Method 3: OIDC (Service-to-Service)

Rate Limiting

Per-IP Rate Limiting

Per-User Concurrent Request Limiting

IP Lockout

App Check (reCAPTCHA Enterprise)

Agent Security (8-Layer Defense-in-Depth)

Layer 0 -- GCP Project Isolation

Layer 0b -- Prompt Security Scanner

Layer 1 -- URL Whitelist (Default-Deny)

Layer 2 -- Private IP Blocking

Layer 3 -- Credential Exfiltration Prevention

Layer 4 -- Rate Limiting

Layer 5 -- Response Sanitization

Layer 6 -- Content Moderation

Layer 7 -- Admin Kill Switch

Credential Encryption (Cloud KMS)

Split Trust

DNS Pinning

Session Encryption (Fernet)

Security Headers

CORS

Error Sanitization

Data Protection Summary

Compliance Summary

Related