Documentation - v3.7.4

OmniRoute Docs

AI gateway for multi-provider LLMs. One endpoint for OpenAI, Anthropic, Gemini, DeepSeek, GitHub Copilot, Claude Code, Cursor, and 100+ more providers.

Open Dashboard Endpoint Page GitHub open_in_new Report Issue

Quick Start

1. Install and run
Run npx omniroute or clone from GitHub and run npm start.
2. Create API key
Go to Endpoint -> Registered Keys. Generate one key per environment.
3. Connect providers
Add provider accounts via OAuth login, API key, or free-tier auto-connect.
4. Set client base URL
Point your IDE or API client to https://<host>/v1. Use provider prefix, for example gh/gpt-5.1-codex.

Features

hub

Multi-Provider Routing

Route requests to 30+ AI providers through a single OpenAI-compatible endpoint. Supports chat, responses, audio, and image APIs.

layers

Combos and Balancing

Create model combos with fallback chains and balancing strategies: round-robin, priority, random, least-used, and cost-optimized.

auto_awesome

Auto-Combo

Automatically create optimized combos based on your connected providers, usage patterns, and model capabilities.

travel_explore

Web Search

Integrated web search with 5 providers (Serper, Brave, Exa, Tavily, Perplexity) including analytics and cost tracking.

bar_chart

Usage and Cost Tracking

Real-time token counting, cost calculation per provider/model, and detailed usage breakdown by API key and account.

analytics

Analytics Dashboard

Visual analytics with charts for requests, tokens, errors, latency, costs, and model popularity over time.

health_and_safety

Health Monitoring

Live health checks, provider status, circuit breaker states, and automatic rate limit detection with exponential backoff.

psychology

Memory System

Persistent conversational memory with extraction, injection, retrieval, and summarization across sessions.

auto_fix_high

Skills Framework

Extensible skill system with built-in and custom skills, sandbox execution, request interception, and context injection.

smart_toy

Agent Communication

Agent Communication Protocol (ACP) registry for managing agent-to-agent workflows and tool orchestration.

terminal

CLI Tools

Manage IDE configurations, export/import backups, discover codex profiles, and configure settings from the dashboard.

shield

Security and Policies

API key authentication, IP filtering, prompt injection guard, domain policies, session management, and audit logging.

Supported Providers

127 providers across three connection types.

Manage Providers

Free Tier

5 providers

if/Qoder AI

qw/Qwen Code

gemini-cli/Gemini CLI

kr/Kiro AI

aq/Amazon Q

OAuth

9 providers

cc/Claude Code

/Antigravity

cx/OpenAI Codex

gh/GitHub Copilot

gitlab-duo/GitLab Duo

cu/Cursor IDE

kmc/Kimi Coding

kc/Kilo Code

cl/Cline

API Key

113 providers

agentrouter/AgentRouter

openrouter/OpenRouter

qianfan/Baidu Qianfan

glm/GLM Coding

glmcn/GLM Coding (China)

glmt/GLM Thinking

bcp/Alibaba Coding Plan

kimi/Kimi

kmca/Kimi Coding (API Key)

minimax/Minimax Coding

minimax-cn/Minimax (China)

crof/CrofAI

alicode/Alibaba

alicode-intl/Alibaba Intl

openai/OpenAI

azure/Azure OpenAI

azure-ai/Azure AI Foundry

bedrock/Amazon Bedrock

watsonx/IBM watsonx.ai Gateway

oci/OCI Generative AI

sap/SAP Generative AI Hub

mdl/Modal

reka/Reka

nlpc/NLP Cloud

runway/Runway

anthropic/Anthropic

gemini/Gemini (Google AI Studio)

ds/DeepSeek

groq/Groq

bb/Blackbox AI

xai/xAI (Grok)

mistral/Mistral

pplx/Perplexity

together/Together AI

fireworks/Fireworks AI

cerebras/Cerebras

cohere/Cohere

nvidia/NVIDIA NIM

nebius/Nebius AI

siliconflow/SiliconFlow

hyp/Hyperbolic

nb/NanoBanana

ollamacloud/Ollama Cloud

hf/HuggingFace

synthetic/Synthetic

kg/Kilo Gateway

vertex/Vertex AI

vp/Vertex AI Partners

zai/Z.AI

opencode-zen/OpenCode Zen

opencode-go/OpenCode Go

ali/Alibaba Cloud (DashScope)

lc/LongCat AI

pol/Pollinations AI

pu/Puter AI

cf/Cloudflare Workers AI

scw/Scaleway AI

deepinfra/DeepInfra

vag/Vercel AI Gateway

lambda/Lambda AI

samba/SambaNova

nscale/nScale

ovh/OVHcloud AI

baseten/Baseten

publicai/PublicAI

moonshot/Moonshot AI

meta/Meta Llama API

v0/v0 (Vercel)

morph/Morph

featherless/Featherless AI

friendli/FriendliAI

llamagate/LlamaGate

heroku/Heroku AI

galadriel/Galadriel

databricks/Databricks

datarobot/DataRobot

clarifai/Clarifai

snowflake/Snowflake Cortex

wandb/Weights & Biases Inference

volcengine/Volcengine

ai21/AI21 Labs

gigachat/GigaChat (Sber)

venice/Venice.ai

codestral/Codestral

upstage/Upstage

maritalk/Maritalk

mimo/Xiaomi MiMo

inet/Inference.net

nanogpt/NanoGPT

predibase/Predibase

bytez/Bytez

aiml/AI/ML API

novita/Novita AI

pi/PiAPI

ggo/GoAPI

lz/LaoZhang AI

glhf/GLHF Chat

cablyai/CablyAI

thebai/TheB.AI

fenayai/FenayAI

empower/Empower

nous/Nous Research

petals/Petals

poe/Poe

gitlab/GitLab Duo PAT

chutes/Chutes.ai

voyage/Voyage AI

jina/Jina AI

fal/Fal.ai

stability/Stability AI

bfl/Black Forest Labs

recraft/Recraft

topaz/Topaz

Common Use Cases

Single endpoint for many providers

Point clients to one base URL and route by model prefix (for example: gh/, cc/, kr/, openai/).

Fallback and model switching with combos

Create combo models in Dashboard and keep client config stable while providers rotate internally.

Usage, cost and debug visibility

Track tokens and cost by provider, account, and API key in Usage and Analytics tabs.

Client Compatibility

Cherry Studio

Base URL: https://<host>/v1
Chat endpoint: /chat/completions
Model recommendation: explicit prefix (gh/..., cc/...)

Codex / GitHub Copilot Models

Use model IDs with gh/.
Codex-family models auto-route to /responses.
Non-Codex models continue on /chat/completions.

Cursor IDE

Use cu/ prefix for Cursor models.
OAuth connection - login from the Providers page.
Supports both chat and responses endpoints.

Claude Code / Antigravity

Use cc/ (Claude) or antigravity/ (Antigravity) prefix.
OAuth connection with automatic token refresh.
Full streaming support for all models.

Windsurf

Use OmniRoute as an OpenAI-compatible base URL and keep explicit provider prefixes for deterministic routing.
Point models to `/v1/chat/completions` for general traffic and preserve `/v1/responses` for Codex-style flows.
Use Dashboard -> CLI Tools for a ready-made Windsurf configuration guide.

Cline

Cline works best with explicit provider/model prefixes so the router never has to guess the backend.
Use `/v1/chat/completions` for general models and reuse the same OmniRoute base URL across different accounts.
Use the Providers dashboard to validate OAuth/API key before debugging Cline runtime issues.

Kimi Coding

Use OmniRoute as a stable base URL while rotating accounts or provider combos underneath.
Prefer prefixed models in coding flows so fallback and audit trail remain explicit.
Use `/v1/responses` when you want native Responses-style routing for tool-using clients.

Protocols: MCP & A2A

OmniRoute exposes two operational protocols in addition to OpenAI-compatible APIs: MCP for tool execution and A2A for agent-to-agent workflows.

MCP (Model Context Protocol)

Use MCP over stdio to let clients discover and call OmniRoute tools with audit visibility.

Start MCP transport with `omniroute --mcp`.
Point your MCP client to stdio transport.
Call `omniroute_get_health` and `omniroute_list_combos` to validate connectivity.

omniroute --mcp

A2A (Agent2Agent)

Use A2A JSON-RPC to submit tasks synchronously or via SSE streaming.

Read `/.well-known/agent.json` for agent discovery.
Send `message/send` or `message/stream` requests to `POST /a2a`.
Manage task lifecycle with `tasks/get` and `tasks/cancel`.

GET /.well-known/agent.json
POST /a2a  (JSON-RPC: message/send | message/stream)

ACP (Agent Communication)

Navigate to Dashboard → Agents to view registered ACP agents.
Register new agents with capabilities and endpoint configuration.
Use CLI Tools to configure agent communication channels.

Dashboard -> Agents
Dashboard -> CLI Tools

Protocol Troubleshooting

If MCP status is offline, verify the stdio process is running and heartbeat file is updating.
If A2A tasks stay in `working`, inspect `/api/a2a/tasks/:id` and stream events for terminal state.
Use `/dashboard/mcp` and `/dashboard/a2a` for operational controls and audit visibility.

MCP Tools

OmniRoute exposes 29 tools via Model Context Protocol for agent orchestration.

29 tools

Routing & Discovery

Health checks, combo management, quota monitoring, cost reporting, and model catalog access.

omniroute_get_healthomniroute_list_combosomniroute_get_combo_metricsomniroute_switch_comboomniroute_check_quotaomniroute_route_requestomniroute_cost_reportomniroute_list_models_catalogomniroute_web_search

Operations & Strategy

Route simulation, budget guards, strategy switching, resilience profiles, and provider metrics.

omniroute_simulate_routeomniroute_set_budget_guardomniroute_set_routing_strategyomniroute_set_resilience_profileomniroute_test_comboomniroute_get_provider_metricsomniroute_best_combo_for_taskomniroute_explain_routeomniroute_get_session_snapshotomniroute_db_health_checkomniroute_sync_pricing

Cache Management

View cache statistics and flush semantic or signature caches.

omniroute_cache_statsomniroute_cache_flush

Memory

Search, add, and clear persistent conversational memory entries.

omniroute_memory_searchomniroute_memory_addomniroute_memory_clear

Skills

List, enable, execute, and monitor custom skill executions.

omniroute_skills_listomniroute_skills_enableomniroute_skills_executeomniroute_skills_executions

API Reference

Method	Path	Notes
`POST`	/v1/chat/completions	OpenAI-compatible chat endpoint (default).
`POST`	/v1/responses	Responses API endpoint (Codex, o-series).
`POST`	/v1/completions	Legacy completions endpoint for text generation.
`GET`	/v1/models	Model catalog for all connected providers.
`POST`	/v1/embeddings	Text embedding generation (OpenAI, Cohere, Voyage).
`POST`	/v1/moderations	Content moderation and safety classification.
`POST`	/v1/rerank	Document reranking for retrieval-augmented generation (Cohere, Jina).
`POST`	/v1/search	Web search with 5 providers (Serper, Brave, Exa, Tavily, Perplexity).
`GET`	/v1/search/analytics	Analytics and metrics for search requests.
`POST`	/v1/audio/transcriptions	Audio transcription (Deepgram, AssemblyAI).
`POST`	/v1/audio/speech	Text-to-speech generation (ElevenLabs, OpenAI TTS).
`POST`	/v1/images/generations	Image generation (NanoBanana).
`POST`	/v1/videos/generations	Video generation (ComfyUI, SD WebUI workflows).
`POST`	/v1/music/generations	Music generation via ComfyUI workflows.
`POST`	/v1/messages	Anthropic-native messages endpoint.
`POST`	/v1/messages/count_tokens	Count tokens for a given message payload.
`POST`	/v1/files	File upload for multimodal inputs.
`POST`	/v1/batches	Batch processing for bulk API requests.
`GET`	/v1/ws	WebSocket endpoint for real-time streaming.
`POST`	/chat/completions	Rewrite helper for clients without /v1.
`POST`	/responses	Rewrite helper for Responses without /v1.
`GET`	/models	Rewrite helper for model discovery without /v1.

Model Prefixes

Use the provider prefix before the model name to route to a specific provider. Example: gh/gpt-5.1-codex routes to GitHub Copilot.

Prefix	Provider	Type
`if/`	Qoder AI	Free Tier
`qw/`	Qwen Code	Free Tier
`gemini-cli/`	Gemini CLI	Free Tier
`kr/`	Kiro AI	Free Tier
`aq/`	Amazon Q	Free Tier
`cc/`	Claude Code	OAuth
`/`	Antigravity	OAuth
`cx/`	OpenAI Codex	OAuth
`gh/`	GitHub Copilot	OAuth
`gitlab-duo/`	GitLab Duo	OAuth
`cu/`	Cursor IDE	OAuth
`kmc/`	Kimi Coding	OAuth
`kc/`	Kilo Code	OAuth
`cl/`	Cline	OAuth
`agentrouter/`	AgentRouter	API Key
`openrouter/`	OpenRouter	API Key
`qianfan/`	Baidu Qianfan	API Key
`glm/`	GLM Coding	API Key
`glmcn/`	GLM Coding (China)	API Key
`glmt/`	GLM Thinking	API Key
`bcp/`	Alibaba Coding Plan	API Key
`kimi/`	Kimi	API Key
`kmca/`	Kimi Coding (API Key)	API Key
`minimax/`	Minimax Coding	API Key
`minimax-cn/`	Minimax (China)	API Key
`crof/`	CrofAI	API Key
`alicode/`	Alibaba	API Key
`alicode-intl/`	Alibaba Intl	API Key
`openai/`	OpenAI	API Key
`azure/`	Azure OpenAI	API Key
`azure-ai/`	Azure AI Foundry	API Key
`bedrock/`	Amazon Bedrock	API Key
`watsonx/`	IBM watsonx.ai Gateway	API Key
`oci/`	OCI Generative AI	API Key
`sap/`	SAP Generative AI Hub	API Key
`mdl/`	Modal	API Key
`reka/`	Reka	API Key
`nlpc/`	NLP Cloud	API Key
`runway/`	Runway	API Key
`anthropic/`	Anthropic	API Key
`gemini/`	Gemini (Google AI Studio)	API Key
`ds/`	DeepSeek	API Key
`groq/`	Groq	API Key
`bb/`	Blackbox AI	API Key
`xai/`	xAI (Grok)	API Key
`mistral/`	Mistral	API Key
`pplx/`	Perplexity	API Key
`together/`	Together AI	API Key
`fireworks/`	Fireworks AI	API Key
`cerebras/`	Cerebras	API Key
`cohere/`	Cohere	API Key
`nvidia/`	NVIDIA NIM	API Key
`nebius/`	Nebius AI	API Key
`siliconflow/`	SiliconFlow	API Key
`hyp/`	Hyperbolic	API Key
`nb/`	NanoBanana	API Key
`ollamacloud/`	Ollama Cloud	API Key
`hf/`	HuggingFace	API Key
`synthetic/`	Synthetic	API Key
`kg/`	Kilo Gateway	API Key
`vertex/`	Vertex AI	API Key
`vp/`	Vertex AI Partners	API Key
`zai/`	Z.AI	API Key
`opencode-zen/`	OpenCode Zen	API Key
`opencode-go/`	OpenCode Go	API Key
`ali/`	Alibaba Cloud (DashScope)	API Key
`lc/`	LongCat AI	API Key
`pol/`	Pollinations AI	API Key
`pu/`	Puter AI	API Key
`cf/`	Cloudflare Workers AI	API Key
`scw/`	Scaleway AI	API Key
`deepinfra/`	DeepInfra	API Key
`vag/`	Vercel AI Gateway	API Key
`lambda/`	Lambda AI	API Key
`samba/`	SambaNova	API Key
`nscale/`	nScale	API Key
`ovh/`	OVHcloud AI	API Key
`baseten/`	Baseten	API Key
`publicai/`	PublicAI	API Key
`moonshot/`	Moonshot AI	API Key
`meta/`	Meta Llama API	API Key
`v0/`	v0 (Vercel)	API Key
`morph/`	Morph	API Key
`featherless/`	Featherless AI	API Key
`friendli/`	FriendliAI	API Key
`llamagate/`	LlamaGate	API Key
`heroku/`	Heroku AI	API Key
`galadriel/`	Galadriel	API Key
`databricks/`	Databricks	API Key
`datarobot/`	DataRobot	API Key
`clarifai/`	Clarifai	API Key
`snowflake/`	Snowflake Cortex	API Key
`wandb/`	Weights & Biases Inference	API Key
`volcengine/`	Volcengine	API Key
`ai21/`	AI21 Labs	API Key
`gigachat/`	GigaChat (Sber)	API Key
`venice/`	Venice.ai	API Key
`codestral/`	Codestral	API Key
`upstage/`	Upstage	API Key
`maritalk/`	Maritalk	API Key
`mimo/`	Xiaomi MiMo	API Key
`inet/`	Inference.net	API Key
`nanogpt/`	NanoGPT	API Key
`predibase/`	Predibase	API Key
`bytez/`	Bytez	API Key
`aiml/`	AI/ML API	API Key
`novita/`	Novita AI	API Key
`pi/`	PiAPI	API Key
`ggo/`	GoAPI	API Key
`lz/`	LaoZhang AI	API Key
`glhf/`	GLHF Chat	API Key
`cablyai/`	CablyAI	API Key
`thebai/`	TheB.AI	API Key
`fenayai/`	FenayAI	API Key
`empower/`	Empower	API Key
`nous/`	Nous Research	API Key
`petals/`	Petals	API Key
`poe/`	Poe	API Key
`gitlab/`	GitLab Duo PAT	API Key
`chutes/`	Chutes.ai	API Key
`voyage/`	Voyage AI	API Key
`jina/`	Jina AI	API Key
`fal/`	Fal.ai	API Key
`stability/`	Stability AI	API Key
`bfl/`	Black Forest Labs	API Key
`recraft/`	Recraft	API Key
`topaz/`	Topaz	API Key

Management API Reference

Automation endpoints for proxy registry, scope assignments, and legacy proxy migration.

Method	Path	Notes
`GET`	/api/providers	List all registered provider connections.
`POST`	/api/providers	Create a new provider connection.
`PUT`	/api/providers/:id	Update an existing provider connection.
`DELETE`	/api/providers/:id	Delete a provider connection.
`POST`	/api/providers/:id/test	Test connectivity and authentication for a provider.
`GET`	/api/providers/:id/models	List available models for a specific provider.
`GET`	/api/settings	Retrieve current application settings.
`PUT`	/api/settings	Update application settings.
`GET`	/api/settings/payload-rules	Get payload transformation rules.
`PUT`	/api/settings/payload-rules	Update payload transformation rules.
`GET`	/api/v1/management/proxies	List saved proxy registry items (supports pagination).
`POST`	/api/v1/management/proxies	Create a reusable proxy item in the registry.
`GET`	/api/v1/management/proxies/health	Get 24h/rolling health metrics per saved proxy from proxy logs.
`PUT`	/api/v1/management/proxies/bulk-assign	Assign or clear one proxy across many scope IDs in one request.
`GET`	/api/v1/management/proxies/assignments	List proxy assignments by scope, scope_id, or proxy_id.
`PUT`	/api/v1/management/proxies/assignments	Assign or clear proxy for global/provider/account/combo scope.
`POST`	/api/settings/proxies/migrate	Import legacy proxyConfig maps into registry assignments.

Troubleshooting

If the client fails with model routing, use explicit provider/model (for example: gh/gpt-5.1-codex).
If you receive ambiguous model errors, pick a provider prefix instead of a bare model ID.
For GitHub Codex-family models, keep model as gh/codex-model; router selects /responses automatically.
Use Dashboard > Providers > Test Connection before testing from IDEs or external clients.
If a provider shows circuit breaker open, wait for the cooldown or check Health page for details.
For OAuth providers, re-authenticate if tokens expire. Check the provider card status indicator.

Loading OmniRoute...