Model Providers

Roster uses a model during participant resolution to interpret the workflow question, select relevant participants, and produce an auditable answer.

The deployment owns the model choice. Roster records model provider, model name, latency, token usage, and resolution metadata for observability.

Admins can store resolver agent metadata and cost inputs in Settings. Provider credentials and provider family defaults remain part of the deployment's runtime configuration.

Provider Model

Roster runs one resolver model provider at a time.

Choose a Provider

Choose the provider value that matches how your deployment reaches the model:

Use case	`ROSTER_MODEL_PROVIDER`	Default API	Model name format
OpenAI	`openai`	`responses`	OpenAI model ID, for example `gpt-5.6-sol`
Mistral	`mistral`	`chat-completions`	Mistral model ID, for example `mistral-large-2512`
Anthropic	`anthropic`	`messages`	Anthropic model ID, for example `claude-opus-4-8`
OpenRouter or another OpenAI-compatible gateway	`openai`	`responses`	Gateway model ID, for example `openai/gpt-5.4-mini`

Provider-specific *_BASE_URL values are optional. Set provider base URLs only when routing through a provider proxy, a regional endpoint, or an approved model gateway. The examples below show the default public provider endpoints so operators know what is used when the variable is omitted.
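
The fallback described above can be sketched in a few lines. This is an illustration, not Roster's implementation: `resolve_base_url` is a hypothetical helper, and the default URLs are the public endpoints listed in the provider examples on this page.

```python
from typing import Mapping

# Default public endpoints, taken from the per-provider examples on this page.
DEFAULT_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "mistral": "https://api.mistral.ai/v1",
    "anthropic": "https://api.anthropic.com",
}


def resolve_base_url(env: Mapping[str, str]) -> str:
    """Hypothetical helper: return the base URL a deployment would call."""
    # OpenAI is documented as the default provider.
    provider = env.get("ROSTER_MODEL_PROVIDER", "openai")
    if provider not in DEFAULT_BASE_URLS:
        raise ValueError(f"unknown provider: {provider!r}")
    # OPENAI_BASE_URL, MISTRAL_BASE_URL, or ANTHROPIC_BASE_URL, if set.
    override = env.get(f"{provider.upper()}_BASE_URL", "").strip()
    return override or DEFAULT_BASE_URLS[provider]
```

Passing the environment as a mapping, for example `resolve_base_url(os.environ)`, keeps the sketch easy to check against a staging configuration before deployment.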

OpenAI

OpenAI is the default provider:

OPENAI_API_KEY=<openai-api-key>
OPENAI_BASE_URL=https://api.openai.com/v1
ROSTER_MODEL_PROVIDER=openai
ROSTER_MODEL_NAME=gpt-5.6-sol
ROSTER_MODEL_EFFORT=low

Mistral

Mistral is supported through its chat-completions API:

MISTRAL_API_KEY=<mistral-api-key>
MISTRAL_BASE_URL=https://api.mistral.ai/v1
ROSTER_MODEL_PROVIDER=mistral
ROSTER_MODEL_NAME=mistral-large-2512

Anthropic

Anthropic is supported through its Messages API:

ANTHROPIC_API_KEY=<anthropic-api-key>
ANTHROPIC_BASE_URL=https://api.anthropic.com
ROSTER_MODEL_PROVIDER=anthropic
ROSTER_MODEL_NAME=claude-opus-4-8

OpenRouter Gateway

For OpenRouter, keep ROSTER_MODEL_PROVIDER=openai because OpenRouter exposes an OpenAI-compatible API.

For OpenRouter-hosted OpenAI models, use the Responses API:

OPENAI_API_KEY=<openrouter-api-key>
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ROSTER_MODEL_PROVIDER=openai
ROSTER_MODEL_NAME=openai/gpt-5.4-mini
ROSTER_MODEL_EFFORT=low

For gateway-backed third-party models, Chat Completions is the usual starting point because support for the newer Responses API varies by gateway model:

OPENAI_API_KEY=<openrouter-api-key>
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ROSTER_MODEL_PROVIDER=openai
ROSTER_MODEL_API=chat-completions
ROSTER_MODEL_NAME=anthropic/claude-opus-4.8

Kimi K3 is a supported exception. Use the Responses API and set reasoning effort to none:

OPENAI_API_KEY=<openrouter-api-key>
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ROSTER_MODEL_PROVIDER=openai
ROSTER_MODEL_API=responses
ROSTER_MODEL_NAME=moonshotai/kimi-k3
ROSTER_MODEL_EFFORT=none

DeepSeek V4 Pro is another supported exception. Use the Responses API with high reasoning effort and raise the gateway output ceiling to 8192 tokens. Lower effort settings are not recommended for this integration:

OPENAI_API_KEY=<openrouter-api-key>
OPENAI_BASE_URL=https://openrouter.ai/api/v1
ROSTER_MODEL_PROVIDER=openai
ROSTER_MODEL_API=responses
ROSTER_MODEL_NAME=deepseek/deepseek-v4-pro
ROSTER_MODEL_EFFORT=high
ROSTER_MODEL_OPENAI_RESPONSES_MAX_OUTPUT_TOKENS=8192

When OPENAI_BASE_URL is set, Roster defaults the OpenAI Responses output cap to 2048 tokens. Override it only when the gateway, model, or key limit requires a different bound:

ROSTER_MODEL_OPENAI_RESPONSES_MAX_OUTPUT_TOKENS=2048
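
As a rough sketch of that precedence, the hypothetical helper below returns the explicit override when present and otherwise the 2048-token default for gateway-style deployments. This page does not state what happens when OPENAI_BASE_URL is unset, so returning `None` in that case is an assumption of the sketch, not documented Roster behavior.

```python
from typing import Mapping, Optional

GATEWAY_DEFAULT_CAP = 2048  # documented default when OPENAI_BASE_URL is set


def resolve_responses_output_cap(env: Mapping[str, str]) -> Optional[int]:
    """Hypothetical helper: pick the Responses output cap in tokens."""
    explicit = env.get("ROSTER_MODEL_OPENAI_RESPONSES_MAX_OUTPUT_TOKENS", "").strip()
    if explicit:
        cap = int(explicit)  # raises ValueError on non-numeric input
        if cap <= 0:
            raise ValueError("output cap must be a positive integer")
        return cap
    if env.get("OPENAI_BASE_URL", "").strip():
        return GATEWAY_DEFAULT_CAP
    # Assumption: no cap is inferred here when no base URL is configured.
    return None
```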

OpenRouter compatibility depends on the selected gateway model, model API, reasoning effort, key limits, and structured output behavior. Validate each gateway, provider, and model combination with your own data before use.

Model API

ROSTER_MODEL_API is optional. When unset, Roster uses the default API for the selected provider:

`ROSTER_MODEL_PROVIDER`	Default `ROSTER_MODEL_API`
`openai`	`responses`
`mistral`	`chat-completions`
`anthropic`	`messages`

Set ROSTER_MODEL_API only when Roster documents another supported API for the same provider. OpenAI-compatible gateways may use chat-completions with ROSTER_MODEL_PROVIDER=openai.
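
The default-and-override rule can be expressed as a small lookup. This is a minimal sketch under stated assumptions: `resolve_model_api` is a hypothetical name, and the supported set is limited to the provider and API pairs named on this page.

```python
from typing import Mapping

# Defaults from the table above.
DEFAULT_API = {
    "openai": "responses",
    "mistral": "chat-completions",
    "anthropic": "messages",
}

# Assumption: only the provider/API pairs named on this page count as supported.
SUPPORTED_APIS = {
    "openai": {"responses", "chat-completions"},
    "mistral": {"chat-completions"},
    "anthropic": {"messages"},
}


def resolve_model_api(env: Mapping[str, str]) -> str:
    """Hypothetical helper: apply ROSTER_MODEL_API or fall back to the default."""
    provider = env.get("ROSTER_MODEL_PROVIDER", "openai")
    if provider not in DEFAULT_API:
        raise ValueError(f"unknown provider: {provider!r}")
    api = env.get("ROSTER_MODEL_API", "").strip() or DEFAULT_API[provider]
    if api not in SUPPORTED_APIS[provider]:
        raise ValueError(f"{api!r} is not documented for provider {provider!r}")
    return api
```

Rejecting undocumented pairs early, rather than at the first model call, makes a misconfigured deployment fail at startup instead of during resolution.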

Supported Runtime Providers

Roster is designed for bring-your-own-model deployments. Deployments can standardize on a direct provider or route through an approved model gateway.

Use a direct provider when you want the simplest vendor-specific integration.
Use an approved gateway when you need centralized routing, budgeting, or model access controls.
Treat gateway compatibility tests separately from Resolve quality tests. A gateway model should still pass your Resolve eval suite before becoming a production recommendation.

Supported Resolve Models

The models below are the current supported Resolve recommendations. Validate cost, latency, data-handling requirements, and account access before production rollout.

Provider	Model	API	Recommended use	Operating note
OpenAI	`gpt-5.6-sol`	`responses`	Current quality-first recommendation	Use `ROSTER_MODEL_EFFORT=low`. Current successor to `gpt-5.5` for complex professional work.
OpenAI	`gpt-5.6-terra`	`responses`	Balanced cost and quality option	Use `ROSTER_MODEL_EFFORT=low`. Lower-cost GPT-5.6 option for everyday production workloads.
OpenAI	`gpt-5.6-luna`	`responses`	Cost-sensitive OpenAI recommendation	Use `ROSTER_MODEL_EFFORT=low`. Canary on representative data before high-impact routing.
OpenAI	`gpt-5.5`	`responses`	Previous-generation Full Resolve option	Use `ROSTER_MODEL_EFFORT=low`. Strong accuracy with moderate latency in the Resolve suite.
OpenAI	`gpt-5.4`	`responses`	Previous-generation OpenAI alternative	Use `ROSTER_MODEL_EFFORT=low`. Useful when newer OpenAI models are not required or available.
OpenAI	`gpt-5.4-mini`	`responses`	Previous-generation cost option	Use `ROSTER_MODEL_EFFORT=low`. Canary on representative data before high-impact routing.
Mistral	`mistral-large-2512`	`chat-completions`	Mistral Full Resolve recommendation	Prefer this over smaller Mistral models for complex multi-responsibility prompts.
Anthropic	`claude-opus-4-8`	`messages`	Anthropic quality-first option	Confirm latency and cost fit before production rollout.
Anthropic	`claude-fable-5`	`messages`	Anthropic Full Resolve recommendation	Confirm latency and cost fit; this is a direct Anthropic model, not an OpenRouter recommendation.

Choose a Cost-Sensitive Option

Reasoning effort and model price are separate controls. A lower effort can reduce reasoning-token usage for models that support it, but it does not change the model’s per-token price. Do not lower effort below the configuration that passed the full Resolve suite merely to reduce cost.
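
A short worked example shows why the two controls are separate. The prices and token counts below are invented for illustration, and the sketch assumes reasoning tokens are billed at the output-token rate, which should be confirmed with your provider.

```python
def run_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one model run, given prices in USD per million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Hypothetical prices: 2.00 USD per million input tokens, 8.00 per million output tokens.
INPUT_PRICE, OUTPUT_PRICE = 2.0, 8.0

# Same model, same prompt. Lower effort produces fewer output tokens
# (200 instead of 500), so the run costs less, but the price is unchanged.
higher_effort_cost = run_cost_usd(1000, 500, INPUT_PRICE, OUTPUT_PRICE)  # 0.006 USD
lower_effort_cost = run_cost_usd(1000, 200, INPUT_PRICE, OUTPUT_PRICE)   # 0.0036 USD
```

Switching to a cheaper model changes the two price arguments instead. That is a different lever and needs its own pass through the Resolve suite.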

Use the lowest-cost Full Resolve option that fits the required provider family:

OpenAI: use gpt-5.4-mini for the lowest current token price; use gpt-5.6-luna when a current-generation model is preferred.
Anthropic: compare current account pricing for claude-fable-5 and claude-opus-4-8; both are supported Full Resolve options.
Mistral: use mistral-large-2512 for complex multi-responsibility prompts.
Kimi through OpenRouter: use moonshotai/kimi-k3 with effort none.
DeepSeek through OpenRouter: use deepseek/deepseek-v4-pro with effort high. V4 Flash is not currently supported as a drop-in Full Resolve substitute.

Provider prices and gateway routing change independently of Roster. Confirm current pricing and run a canary with representative data before rollout.

Do not promote a different model to production until it passes the Resolve eval suite with your expected data shape and guardrails. Other provider-compatible models may work, but limit them to testing or canaries until then.

OpenRouter Gateway Models

Treat OpenRouter as a gateway configuration for supported model families, not as a separate direct provider recommendation. The following gateway model IDs have passed the current full Resolve suite:

Gateway model	API	Effort
`openai/gpt-5.5`	`responses`	`low`
`openai/gpt-5.4`	`responses`	`low`
`openai/gpt-5.4-mini`	`responses`	`low`
`anthropic/claude-opus-4.8`	`chat-completions`	n/a
`moonshotai/kimi-k3`	`responses`	`none`
`deepseek/deepseek-v4-pro`	`responses`	`high`

For deepseek/deepseek-v4-pro, also set ROSTER_MODEL_OPENAI_RESPONSES_MAX_OUTPUT_TOKENS=8192. The default gateway ceiling of 2048 is not the validated Full Resolve configuration.

Treat OpenRouter model IDs as deployment-specific candidates. A one-case smoke test can prove API compatibility, but the full Resolve suite is the minimum bar for documenting a model as recommended.

OpenRouter mistralai/mistral-large-2512 remains a canary because it has not yet passed that full-suite bar; this does not affect the direct Mistral recommendation above.

For Kimi K3 and DeepSeek V4 Pro, use the exact API, effort, and output ceiling settings shown above rather than substituting another model or API mode. Canary gateway models on representative data because pricing, routing, and latency can vary by provider and region.
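
One way to keep a deployment aligned with the validated settings is to compare its environment against the gateway table before rollout. The sketch below is illustrative only: `check_gateway_config` is a hypothetical helper, and it assumes the deployment already uses ROSTER_MODEL_PROVIDER=openai with OPENAI_BASE_URL pointing at the gateway.

```python
from typing import List, Mapping

# Gateway model IDs and settings copied from the table above.
VALIDATED_GATEWAY_CONFIGS = {
    "openai/gpt-5.5": {"api": "responses", "effort": "low"},
    "openai/gpt-5.4": {"api": "responses", "effort": "low"},
    "openai/gpt-5.4-mini": {"api": "responses", "effort": "low"},
    "anthropic/claude-opus-4.8": {"api": "chat-completions", "effort": None},
    "moonshotai/kimi-k3": {"api": "responses", "effort": "none"},
    "deepseek/deepseek-v4-pro": {"api": "responses", "effort": "high", "max_output_tokens": 8192},
}
GATEWAY_DEFAULT_CAP = 2048  # documented default ceiling behind a gateway
CAP_VAR = "ROSTER_MODEL_OPENAI_RESPONSES_MAX_OUTPUT_TOKENS"


def check_gateway_config(env: Mapping[str, str]) -> List[str]:
    """Hypothetical helper: list mismatches against the validated settings."""
    name = env.get("ROSTER_MODEL_NAME", "")
    expected = VALIDATED_GATEWAY_CONFIGS.get(name)
    if expected is None:
        return [f"{name or '<unset>'} is not in the validated list; treat it as a canary"]
    problems = []
    api = env.get("ROSTER_MODEL_API") or "responses"  # default for provider openai
    if api != expected["api"]:
        problems.append(f"ROSTER_MODEL_API should be {expected['api']}, got {api}")
    effort = expected.get("effort")
    if effort is not None and env.get("ROSTER_MODEL_EFFORT") != effort:
        problems.append(f"ROSTER_MODEL_EFFORT should be {effort}")
    cap = expected.get("max_output_tokens")
    if cap is not None:
        actual = int(env.get(CAP_VAR) or GATEWAY_DEFAULT_CAP)
        if actual != cap:
            problems.append(f"{CAP_VAR} should be {cap}, got {actual}")
    return problems
```

An empty list means the environment matches a validated row. It does not replace running the Resolve suite on your own data.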

Reasoning Effort

ROSTER_MODEL_EFFORT controls the requested reasoning level for providers that support effort controls. Roster accepts the value for all providers, but direct Mistral and Anthropic requests currently ignore it. Accepted values:

none
minimal
low
medium
high
xhigh
max

Use lower effort for high-volume routing and higher effort for complex, high-impact workflows. Gateway models can interpret effort controls differently from direct OpenAI models. If an OpenAI-compatible gateway model returns empty outputs or unstable structured responses, validate the same model with ROSTER_MODEL_EFFORT=none before promoting or rejecting it.
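
The effort rules can be checked before deployment with a sketch like the following. `resolve_effort` is a hypothetical helper; it validates the value against the list above and reports whether the selected provider would act on it, based only on the behavior described in this section.

```python
from typing import Mapping, Optional, Tuple

EFFORT_VALUES = ("none", "minimal", "low", "medium", "high", "xhigh", "max")

# Direct Mistral and Anthropic requests currently ignore the effort value.
PROVIDERS_IGNORING_EFFORT = frozenset({"mistral", "anthropic"})


def resolve_effort(env: Mapping[str, str]) -> Tuple[Optional[str], bool]:
    """Hypothetical helper: return (effort, applied) for a deployment env."""
    provider = env.get("ROSTER_MODEL_PROVIDER", "openai")
    effort = env.get("ROSTER_MODEL_EFFORT", "").strip() or None
    if effort is not None and effort not in EFFORT_VALUES:
        raise ValueError(f"unsupported ROSTER_MODEL_EFFORT: {effort!r}")
    # The value is accepted for every provider but only sent where supported.
    applied = effort is not None and provider not in PROVIDERS_IGNORING_EFFORT
    return effort, applied
```

For gateway models reached through ROSTER_MODEL_PROVIDER=openai, `applied` only means the value is sent. How the gateway model interprets it still has to be validated.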

Production Checklist

Choose the model provider or gateway before exposing resolution to users.
Store model credentials in the deployment secret manager.
Verify structured response behavior before approving a model outside the tested recommendations.
Confirm data residency, retention, and logging requirements with the model provider or gateway.
Monitor model runs for latency, cost, error rate, and resolution quality.

Use Model Runs to inspect recorded invocations and correlate provider/model choices with latency, token usage, cost, and errors.
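
If model run records are exported for offline analysis, a per-model summary might look like the sketch below. The record shape (`provider`, `model`, `latency_ms`, `total_tokens`, `error`) is an assumption made for this example and is not Roster's export format.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Mapping, Tuple


def summarize_runs(runs: Iterable[Mapping]) -> Dict[Tuple[str, str], dict]:
    """Group hypothetical run records by (provider, model) and aggregate them."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run["provider"], run["model"])].append(run)
    summary = {}
    for key, items in groups.items():
        summary[key] = {
            "runs": len(items),
            "mean_latency_ms": mean(r["latency_ms"] for r in items),
            "total_tokens": sum(r["total_tokens"] for r in items),
            # Share of runs in the group that recorded an error.
            "error_rate": sum(1 for r in items if r["error"]) / len(items),
        }
    return summary
```

Comparing these figures before and after a model change gives a quick signal on latency and error regressions. Resolution quality still needs the Resolve eval suite.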