Provider Routing | Proxyify Docs

Provider Routing

Control which AI infrastructure providers handle your request: by price, latency, throughput, or an explicit allowlist. Proxyify forwards the provider object to its upstream routing layer unchanged, so all provider selection logic runs there.

The provider object

Add a provider key to any text or image request body:

json
{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [{ "role": "user", "content": "Hello" }],
  "provider": {
    "sort": "latency",
    "order": ["fireworks", "together"],
    "allow_fallbacks": true,
    "allow": ["fireworks", "together", "deepinfra"],
    "ignore": ["openai"],
    "quantizations": ["fp8", "fp16"],
    "data_collection": "deny"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| sort | string | "price" / "throughput" / "latency". Disables load-balancing and sorts by the chosen metric. |
| order | list[string] | Try these providers in order before falling back to others. Example: ["openai", "azure"] |
| allow_fallbacks | bool | Default true. Set false to only use providers in order; returns 503 if none available. |
| allow | list[string] | Allowlist: only these providers will be used. |
| ignore | list[string] | Blocklist: these providers will never be used. |
| quantizations | list[string] | Accepted quantization levels: int4, int8, fp6, fp8, fp16, bf16. |
| data_collection | string | "deny": prefer zero-data-retention (ZDR) providers. "allow" (default): no restriction. |
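
The fields above can be assembled and sanity-checked on the client before sending. The following is a minimal sketch, not part of any Proxyify SDK: `build_provider` is a hypothetical helper, and leaving out fields that equal their documented defaults is an assumption about how the upstream layer treats missing keys.

```python
import json

# Values taken from the field table above.
VALID_SORTS = {"price", "throughput", "latency"}
VALID_QUANTIZATIONS = {"int4", "int8", "fp6", "fp8", "fp16", "bf16"}


def build_provider(sort=None, order=None, allow_fallbacks=True, allow=None,
                   ignore=None, quantizations=None, data_collection="allow"):
    """Hypothetical helper: return a provider dict with only non-default fields."""
    if sort is not None and sort not in VALID_SORTS:
        raise ValueError(f"unknown sort: {sort!r}")
    if data_collection not in ("allow", "deny"):
        raise ValueError(f"unknown data_collection: {data_collection!r}")
    bad = set(quantizations or []) - VALID_QUANTIZATIONS
    if bad:
        raise ValueError(f"unknown quantizations: {sorted(bad)}")

    provider = {}
    if sort is not None:
        provider["sort"] = sort
    if order:
        provider["order"] = list(order)
    if not allow_fallbacks:  # assumed: the documented default (true) can be omitted
        provider["allow_fallbacks"] = False
    if allow:
        provider["allow"] = list(allow)
    if ignore:
        provider["ignore"] = list(ignore)
    if quantizations:
        provider["quantizations"] = list(quantizations)
    if data_collection != "allow":  # assumed: the documented default can be omitted
        provider["data_collection"] = data_collection
    return provider


body = {
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": build_provider(
        sort="latency",
        order=["fireworks", "together"],
        allow_fallbacks=False,
        data_collection="deny",
    ),
}
print(json.dumps(body, indent=2))
```

Validating locally only catches typos early; the upstream routing layer remains the authority on which values are accepted.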

Model slug shortcuts

Append a suffix directly to the model slug instead of writing a full provider object:

| Suffix | Equivalent to |
| --- | --- |
| :nitro | provider.sort = "throughput" (fastest provider) |
| :floor | provider.sort = "price" (cheapest provider) |

json (nitro shortcut)
{
  "model": "meta-llama/llama-3.3-70b-instruct:nitro",
  "messages": [{ "role": "user", "content": "Hello" }]
}
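
To make the equivalence concrete, the sketch below expands a suffixed slug into the base model plus the matching provider object. It is only an illustration of the mapping in the table: `expand_shortcut` is a hypothetical name, and the real expansion happens in the upstream routing layer, not in client code.

```python
# Suffix-to-sort mapping copied from the shortcut table above.
SHORTCUTS = {":nitro": "throughput", ":floor": "price"}


def expand_shortcut(slug):
    """Hypothetical helper: split a slug into (base_model, provider dict)."""
    for suffix, sort in SHORTCUTS.items():
        if slug.endswith(suffix):
            return slug[: -len(suffix)], {"sort": sort}
    # No recognised suffix: the slug is already a plain model name.
    return slug, {}


model, provider = expand_shortcut("meta-llama/llama-3.3-70b-instruct:nitro")
print(model, provider)
```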

Fallback model list

Use the models array to define a fallback chain. If the primary model fails, each model in models is tried in order:

json
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
  "route": "fallback"
}
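
The order in which the chain is walked can be modelled locally. This is a sketch under stated assumptions: the fallback itself runs upstream, `attempt_order` and `first_success` are hypothetical helpers, and `fake_call` is a stub standing in for a real request. What counts as a failure upstream is not specified here, so the stub simply raises.

```python
def attempt_order(body):
    """Primary model first, then each entry of "models" in order."""
    return [body["model"], *body.get("models", [])]


def first_success(body, call):
    """Return (model, result) for the first model whose call does not raise."""
    last_error = None
    for model in attempt_order(body):
        try:
            return model, call(model)
        except Exception as exc:  # any failure moves on to the next model
            last_error = exc
    raise RuntimeError("all models failed") from last_error


body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
    "route": "fallback",
}


def fake_call(model):
    """Stub: pretend the primary model is unavailable."""
    if model == "openai/gpt-4o":
        raise ConnectionError("primary unavailable")
    return f"reply from {model}"


print(first_success(body, fake_call))
```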

Plugins

Extend model capabilities with upstream plugins. Pass a plugins array in the request:

json
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Summarise today's AI news" }],
  "plugins": [
    { "id": "web" },
    { "id": "response-healing" }
  ]
}

| Plugin ID | Description |
| --- | --- |
| web | Real-time web search: the model can fetch current information. |
| file-parser | PDF and document processing. |
| response-healing | Automatically repairs malformed JSON responses. |
| context-compression | Compresses middle portions of long contexts to reduce token usage. |
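
A small client-side check can catch misspelled plugin IDs before the request is sent. The sketch below assumes the four IDs listed above are the complete set and that each plugin entry needs only an `id` key, as in the example request; `plugins_for` is a hypothetical helper.

```python
# Plugin IDs copied from the table above (assumed to be the full list).
KNOWN_PLUGINS = {"web", "file-parser", "response-healing", "context-compression"}


def plugins_for(*ids):
    """Hypothetical helper: build the plugins array, rejecting unknown IDs."""
    unknown = [i for i in ids if i not in KNOWN_PLUGINS]
    if unknown:
        raise ValueError(f"unknown plugin IDs: {unknown}")
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return [{"id": i} for i in dict.fromkeys(ids)]


body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Summarise today's AI news"}],
    "plugins": plugins_for("web", "response-healing"),
}
print(body["plugins"])
```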

End-user ID

Pass a stable identifier for the end-user in the user field. It is used for abuse detection upstream — Proxyify does not log or store it.

json
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "user": "user_abc123"
}
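
One way to obtain a stable identifier without sending an internal account ID or email upstream is to hash it. This is a suggestion rather than a documented requirement: the `end_user_id` helper, the salt value, the `user_` prefix, and the 16-character truncation are all illustrative assumptions.

```python
import hashlib


def end_user_id(internal_id, salt="example-salt"):
    """Hypothetical helper: derive a stable, opaque ID from an internal one."""
    digest = hashlib.sha256(f"{salt}:{internal_id}".encode("utf-8")).hexdigest()
    # Same input always yields the same value, so the ID stays stable per user.
    return f"user_{digest[:16]}"


body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "user": end_user_id("alice@example.com"),
}
print(body["user"])
```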