Anthropic (Claude)

The Majestix AI Inference Hub provides access to three Claude models from Anthropic. All Claude models support tool use (function calling), vision (image inputs), and streaming. They share a 200K token context window.

Available Models

Key             Underlying Model     Context Window   Max Output   Category
claude-sonnet   Claude Sonnet 4.6    200K             64,000       Balanced
claude-haiku    Claude Haiku 4.5     200K             64,000       Fast / Cheap
claude-opus     Claude Opus 4.6      200K             128,000      Reasoning
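The Key column is the identifier you pass as the model name in a request. A minimal sketch, assuming a chat-style request body; the field names here are illustrative, not the platform's documented schema:

```python
# Sketch of a chat request using a model key from the table above.
# The payload shape ("model", "messages", "max_tokens") is an
# assumption for illustration, not the platform's documented API.

def build_chat_request(model_key: str, user_message: str) -> dict:
    """Assemble a request body for a chat-style inference call."""
    return {
        "model": model_key,  # e.g. "claude-sonnet"
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 1024,
    }

payload = build_chat_request("claude-sonnet", "Summarize this ticket.")
```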

claude-sonnet

Claude Sonnet 4.6 is the recommended default for most workloads. It delivers strong performance across writing, analysis, code generation, and conversational tasks while keeping costs moderate.

When to use: General-purpose chat, summarization, content generation, moderate code tasks, and any workload where you need a reliable balance of quality and speed.

When to consider alternatives: If your task is trivially simple (use claude-haiku instead) or requires the deepest possible reasoning on a hard problem (use claude-opus instead).

claude-haiku

Claude Haiku 4.5 is the fastest and most cost-effective model in the Claude family. It produces competent responses at significantly lower credit cost and latency than Sonnet and Opus.

When to use: High-volume classification, simple Q&A, formatting or transformation tasks, quick lookups, and latency-sensitive pipelines where response time matters more than nuance.

When to consider alternatives: If the task requires detailed analysis, multi-step reasoning, or high-fidelity creative writing, step up to claude-sonnet or claude-opus.

claude-opus

Claude Opus 4.6 is Anthropic's most capable model. It excels at complex multi-step reasoning, nuanced analysis, long-form writing, and problems that benefit from deep deliberation.

When to use: Research synthesis, complex code architecture decisions, legal or financial document analysis, math proofs, and any task where getting the answer right matters more than speed or cost.

When to consider alternatives: For straightforward tasks, Opus is unnecessarily expensive. Use claude-sonnet for everyday work and reserve Opus for problems that genuinely demand its reasoning depth.
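The guidance in the three sections above can be condensed into a simple routing rule. A sketch, assuming your application tags each request with a rough task category; the category names are hypothetical:

```python
# Illustrative model router based on the guidance above.
# The category tags are hypothetical labels an application might
# assign to incoming requests; only the model keys come from the docs.

ROUTES = {
    "classification": "claude-haiku",   # high-volume, simple tasks
    "formatting":     "claude-haiku",
    "chat":           "claude-sonnet",  # general-purpose default
    "summarization":  "claude-sonnet",
    "research":       "claude-opus",    # deepest reasoning
    "proof":          "claude-opus",
}

def pick_model(task_category: str) -> str:
    # Fall back to the recommended default for unrecognized categories.
    return ROUTES.get(task_category, "claude-sonnet")
```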

Shared Capabilities

All three Claude models support the following:

  • Tool use (function calling): Define tools in your request and the model will produce structured tool call outputs when appropriate.

  • Vision: Pass images (base64 or URL) in the message content array. The model can analyze, describe, and reason about visual inputs.

  • System prompts: Use the system field to set persistent instructions that guide the model's behavior throughout the conversation.

  • Streaming: All responses are delivered via SSE. Partial tokens arrive as chunk events, followed by a done event with usage metadata.
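As a sketch of consuming the stream described above: the chunk and done event names come from this page, but the JSON fields inside each event (a delta string, a usage object) are assumptions for illustration:

```python
import json

# Minimal parser for the SSE shape described above: partial tokens
# arrive as "chunk" events, then a final "done" event carries usage
# metadata. The payload field names are assumptions, not documented.

def consume_sse(lines):
    """Accumulate text deltas; return (text, usage from the done event)."""
    text_parts, usage, event = [], None, None
    for line in lines:
        if line.startswith("event: "):
            event = line[len("event: "):].strip()
        elif line.startswith("data: "):
            data = json.loads(line[len("data: "):])
            if event == "chunk":
                text_parts.append(data.get("delta", ""))
            elif event == "done":
                usage = data.get("usage")
    return "".join(text_parts), usage

# Example stream, hand-written to match the assumed shape:
stream = [
    "event: chunk", 'data: {"delta": "Hel"}',
    "event: chunk", 'data: {"delta": "lo"}',
    "event: done",  'data: {"usage": {"output_tokens": 2}}',
]
text, usage = consume_sse(stream)
```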

Context Window Considerations

The 200K token context window accommodates approximately 150,000 words of input, which is sufficient for most use cases, including long documents, multi-turn conversations, and sizable codebases. If your input consistently exceeds 200K tokens, consider the Vertex AI models (Gemini), which offer 1M-token contexts, or OpenRouter models such as Grok 4.1 Fast, which supports up to 2M tokens.

Pricing Tier

Claude models sit in the mid-to-premium pricing range within the platform's credits system. Haiku is the most economical Claude option. Sonnet is moderately priced. Opus carries the highest per-token cost in the catalog. All costs are deducted automatically through the platform's credit reservation system.
