Open Source

The Majestix AI Inference Hub provides access to six models through the OpenRouter relay, drawn from open-source and third-party providers. These models offer a range of specializations -- from budget-friendly general use to massive context windows to dedicated reasoning and coding -- and are often the most cost-effective options in the catalog.

Available Models

| Key | Provider / Origin | Underlying Model | Context Window | Category |
| --- | --- | --- | --- | --- |
| llama-4-maverick | Meta | Llama 4 Maverick | 1M | Large Context / Open Source |
| deepseek-v3.2 | DeepSeek | DeepSeek V3.2 | 164K | Budget / General |
| deepseek-r1 | DeepSeek | DeepSeek R1 | 64K | Reasoning |
| qwen3-coder | Alibaba | Qwen3 Coder | 262K | Coding / Agentic |
| kimi-k2.5 | Moonshot | Kimi K2.5 | 262K | Versatile / General |
| grok-4.1-fast | xAI | Grok 4.1 Fast | 2M | Massive Context |

llama-4-maverick

Llama 4 Maverick is Meta's large-context open-source model. With a 1M token context window, it matches the Gemini models in input capacity while offering the cost advantages typical of open-source models served through OpenRouter.

When to use: Large-document processing, codebase analysis, research tasks involving many sources, and workloads where you want a high-capacity context window at a lower price point than major-provider models.

When to consider alternatives: For the absolute largest context window, grok-4.1-fast offers 2M tokens. For reasoning-heavy tasks on shorter inputs, deepseek-r1 or claude-opus may produce better results.

deepseek-v3.2

DeepSeek V3.2 is the cheapest model in the Majestix AI Inference Hub catalog. Despite its low cost, it delivers competent performance across general-purpose tasks including conversation, summarization, and light analysis.

When to use: Budget-sensitive workloads, high-volume batch processing, prototyping and experimentation, simple Q&A, and any scenario where minimizing credit spend is the primary concern.

When to consider alternatives: For tasks requiring higher-quality reasoning, nuanced writing, or complex code generation, step up to a mid-tier model like claude-sonnet or gpt-5.2. DeepSeek V3.2 trades quality headroom for cost efficiency.

deepseek-r1

DeepSeek R1 is a reasoning-specialized model. It uses chain-of-thought processing to work through complex problems methodically, making it one of the strongest reasoning options on the platform -- particularly given its price point.

When to use: Math problems, logical puzzles, multi-step analytical tasks, scientific reasoning, and any problem where showing the work matters. It is a cost-effective alternative to claude-opus for reasoning tasks.

When to consider alternatives: R1's chain-of-thought style can produce verbose responses for simple tasks. For quick answers to straightforward questions, use deepseek-v3.2 or claude-haiku. For reasoning tasks that also require a very large context window, consider gemini-3.1-pro.

qwen3-coder

Qwen3 Coder is a coding-specialized model from Alibaba's Qwen family. It is optimized for code generation, refactoring, debugging, and agentic tool-use workflows, with a 262K token context window that accommodates substantial codebases.

When to use: Code generation, code review, debugging, writing tests, refactoring, and agentic workflows that involve structured tool calls. Its 262K context window can hold significant portions of a codebase for context-aware coding assistance.

When to consider alternatives: For coding tasks where you need the highest-quality output regardless of cost, gpt-5.2-codex is OpenAI's flagship coding model with a 400K context window. For non-coding tasks, a general-purpose model will perform better.
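The agentic tool-use workflows mentioned above typically rely on structured tool definitions in the request. A minimal sketch of such a payload, assuming the hub accepts an OpenAI-compatible `tools` array (the tool name and schema here are hypothetical illustrations, not part of this catalog -- check the hub's API reference for the exact format):

```python
# Sketch of a tool-calling request payload for qwen3-coder.
# The OpenAI-compatible "tools" schema is an assumption; the "run_tests"
# tool below is a hypothetical example, not a platform-provided tool.

run_tests_tool = {
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool name
        "description": "Run the project's test suite and return failures.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Test file or directory to run.",
                },
            },
            "required": ["path"],
        },
    },
}

payload = {
    "model": "qwen3-coder",
    "messages": [
        {"role": "user", "content": "Run the tests under tests/ and fix any failures."}
    ],
    "tools": [run_tests_tool],
}
```

The model can then respond with a structured call to `run_tests` instead of free text, which is what makes the agentic loop possible.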

kimi-k2.5

Kimi K2.5 is a versatile general-purpose model from Moonshot AI. It offers strong all-around performance with a 262K token context window, positioning it as a capable mid-tier option that handles a wide range of tasks competently.

When to use: General conversation, content generation, analysis, summarization, and any workload that benefits from a solid generalist model with a large context window. It provides a good balance of quality, speed, and cost.

When to consider alternatives: For tasks that demand the absolute best quality, flagship models like claude-sonnet, gpt-5.2, or claude-opus will outperform it. For purely budget-driven workloads, deepseek-v3.2 is cheaper.

grok-4.1-fast

Grok 4.1 Fast from xAI offers the largest context window in the entire Majestix AI Inference Hub catalog at 2M tokens. This is double the Gemini models' 1M context and five times the GPT-5 family's 400K context.

When to use: Tasks that require ingesting extremely large inputs -- entire multi-file codebases, collections of long documents, full books with extensive annotations, or any workload where other models' context windows are insufficient. Also suitable as a fast general-purpose model when massive context is needed.

When to consider alternatives: If your input fits within 1M tokens, Gemini models offer comparable context capacity and may provide better quality for reasoning tasks. For inputs under 400K tokens, the full range of GPT-5 and Claude models becomes available, offering broader model selection.

Context Window Comparison

| Model | Context Window |
| --- | --- |
| grok-4.1-fast | 2M |
| llama-4-maverick | 1M |
| qwen3-coder | 262K |
| kimi-k2.5 | 262K |
| deepseek-v3.2 | 164K |
| deepseek-r1 | 64K |

Pricing Tier

OpenRouter models are generally the most cost-effective options on the platform. DeepSeek V3.2 is the cheapest model available. Llama 4 Maverick, Qwen3 Coder, and Kimi K2.5 sit at moderate price points. DeepSeek R1 and Grok 4.1 Fast are priced slightly higher due to their specialized capabilities but remain competitive with major-provider alternatives.

A Note on OpenRouter

These models are served through OpenRouter, a relay service that provides unified API access to models from multiple providers. The Majestix AI Inference Hub handles all OpenRouter integration transparently -- you simply specify the model key in your request and the platform routes it appropriately. There is no difference in the request or response format compared to models from other providers.
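In practice that means a request to any of these models differs only in the `model` field. A minimal sketch using Python's standard library, with a placeholder endpoint and API key (the real base URL and auth header are assumptions -- take them from the hub's API reference):

```python
import json
import urllib.request

# Placeholder endpoint and key -- substitute the hub's real values.
API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical URL
API_KEY = "YOUR_API_KEY"

def build_request(model_key: str, prompt: str) -> urllib.request.Request:
    """Build a chat request; only the model key changes between providers."""
    payload = {
        "model": model_key,  # e.g. "deepseek-v3.2" or "grok-4.1-fast"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Swapping providers is just a different model key:
req = build_request("deepseek-v3.2", "Summarize this changelog.")
```

Sending the request (for example with `urllib.request.urlopen(req)`) and parsing the response works the same way regardless of which provider ultimately serves the model.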
