Provider Routing
Control which AI infrastructure providers handle your request — by price, latency, throughput, or an explicit allowlist. Proxyify forwards the provider object to its upstream routing layer unchanged — all provider selection logic runs there.
The provider object
Add a provider key to any text or image request body:
{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [{ "role": "user", "content": "Hello" }],
  "provider": {
    "sort": "latency",
    "order": ["fireworks", "together"],
    "allow_fallbacks": true,
    "allow": ["fireworks", "together", "deepinfra"],
    "ignore": ["openai"],
    "quantizations": ["fp8", "fp16"],
    "data_collection": "deny"
  }
}
| Field | Type | Description |
|---|---|---|
| sort | string | "price" / "throughput" / "latency". Disables load-balancing and sorts providers by the chosen metric. |
| order | list[string] | Try these providers in order before falling back to others. Example: ["openai", "azure"] |
| allow_fallbacks | bool | Default true. Set to false to use only the providers listed in the order field; the request returns 503 if none is available. |
| allow | list[string] | Allowlist: only these providers will be used. |
| ignore | list[string] | Blocklist: these providers will never be used. |
| quantizations | list[string] | Accepted quantization levels: int4, int8, fp6, fp8, fp16, bf16. |
| data_collection | string | "deny": prefer zero-data-retention (ZDR) providers. "allow" (default): no restriction. |
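As a rough illustration of the fields in the table, the sketch below builds a provider object in Python and rejects values the table does not list. The helper name `build_provider` is hypothetical and not part of Proxyify, and the check that `allow_fallbacks=False` needs a non-empty `order` is an assumption inferred from the 503 behaviour described above, not a documented rule.

```python
import json

# Allowed values copied from the field table above.
SORT_KEYS = {"price", "throughput", "latency"}
QUANTIZATIONS = {"int4", "int8", "fp6", "fp8", "fp16", "bf16"}
DATA_COLLECTION = {"allow", "deny"}


def build_provider(sort=None, order=None, allow_fallbacks=True, allow=None,
                   ignore=None, quantizations=None, data_collection="allow"):
    """Hypothetical helper: validate inputs and return a provider object."""
    if sort is not None and sort not in SORT_KEYS:
        raise ValueError(f"sort must be one of {sorted(SORT_KEYS)}")
    unknown = set(quantizations or []) - QUANTIZATIONS
    if unknown:
        raise ValueError(f"unknown quantizations: {sorted(unknown)}")
    if data_collection not in DATA_COLLECTION:
        raise ValueError("data_collection must be 'allow' or 'deny'")
    if not allow_fallbacks and not order:
        # Assumption: with fallbacks disabled only `order` is consulted,
        # so an empty list could never match a provider.
        raise ValueError("allow_fallbacks=False requires a non-empty order")

    provider = {"allow_fallbacks": allow_fallbacks,
                "data_collection": data_collection}
    # Include optional fields only when the caller supplied them.
    for key, value in (("sort", sort), ("order", order), ("allow", allow),
                       ("ignore", ignore), ("quantizations", quantizations)):
        if value:
            provider[key] = value
    return provider


body = {
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": build_provider(order=["fireworks", "together"],
                               allow_fallbacks=False,
                               quantizations=["fp8"],
                               data_collection="deny"),
}
print(json.dumps(body, indent=2))
```

Validating locally only catches typos early; per the introduction, the actual provider selection still happens in the upstream routing layer.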
Model slug shortcuts
Append a suffix directly to the model slug instead of writing a full provider object:
| Suffix | Equivalent to |
|---|---|
| :nitro | provider.sort = "throughput" (highest-throughput provider) |
| :floor | provider.sort = "price" (cheapest provider) |
{
  "model": "meta-llama/llama-3.3-70b-instruct:nitro",
  "messages": [{ "role": "user", "content": "Hello" }]
}
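To make the equivalence in the suffix table concrete, here is a minimal Python sketch that splits a known suffix off a slug and returns the matching provider object. The function `expand_slug` is a hypothetical illustration; the real expansion is presumably performed upstream, and how unknown suffixes are treated there is not specified here.

```python
# Suffix table from this section; both entries map to a provider.sort value.
SUFFIX_TO_SORT = {"nitro": "throughput", "floor": "price"}


def expand_slug(model):
    """Hypothetical helper: return (base_model, provider_object).

    Suffixes not listed in the table are left untouched, since this
    sketch cannot know what else a colon in a slug might mean.
    """
    base, sep, suffix = model.rpartition(":")
    if sep and suffix in SUFFIX_TO_SORT:
        return base, {"sort": SUFFIX_TO_SORT[suffix]}
    return model, {}


base, provider = expand_slug("meta-llama/llama-3.3-70b-instruct:nitro")
print(base, provider)
```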
Fallback model list
Use the models array to define a fallback chain. If the primary model fails, each model in models is tried in order:
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
  "route": "fallback"
}
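The ordering described above (primary model first, then each entry of models) can be sketched locally in Python. This is only an illustration of the order of attempts, not how the service implements it; `fallback_chain`, `first_success`, and the caller-supplied `attempt` function are hypothetical names.

```python
def fallback_chain(body):
    """Return model slugs in the order described: primary, then `models`."""
    return [body["model"], *body.get("models", [])]


def first_success(body, attempt):
    """Try each model in order and return (slug, response) for the first
    one that works. `attempt` is a stand-in that takes a slug and either
    returns a response or raises an exception on failure."""
    failed = []
    for slug in fallback_chain(body):
        try:
            return slug, attempt(slug)
        except Exception:
            failed.append(slug)
    raise RuntimeError(f"all models failed: {failed}")


request = {
    "model": "openai/gpt-4o",
    "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
    "route": "fallback",
}
print(fallback_chain(request))
```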
Plugins
Extend model capabilities with upstream plugins. Pass a plugins array in the request:
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Summarise today's AI news" }],
  "plugins": [
    { "id": "web" },
    { "id": "response-healing" }
  ]
}
| Plugin ID | Description |
|---|---|
| web | Real-time web search. The model can fetch current information. |
| file-parser | PDF and document processing. |
| response-healing | Automatically repairs malformed JSON responses. |
| context-compression | Compresses middle portions of long contexts to reduce token usage. |
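A small Python sketch of assembling the plugins array from the IDs in the table above. The helper `build_plugins` is hypothetical, and it assumes that each plugin needs only an id key, as in the example request; any additional per-plugin options are outside what this section documents.

```python
# Plugin IDs copied from the table above.
KNOWN_PLUGINS = {"web", "file-parser", "response-healing", "context-compression"}


def build_plugins(*ids):
    """Hypothetical helper: turn plugin IDs into a `plugins` array."""
    unknown = [i for i in ids if i not in KNOWN_PLUGINS]
    if unknown:
        raise ValueError(f"unknown plugin ids: {unknown}")
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return [{"id": i} for i in dict.fromkeys(ids)]


body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Summarise today's AI news"}],
    "plugins": build_plugins("web", "response-healing"),
}
print(body["plugins"])
```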
End-user ID
Pass a stable identifier for the end-user in the user field. It is used for abuse detection upstream — Proxyify does not log or store it.
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "user": "user_abc123"
}
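One possible way to produce a stable identifier without sending an internal account ID upstream is to derive it with a keyed hash. This is a design suggestion rather than a Proxyify requirement; the function name `stable_user_id`, the `user_` prefix, and the 16-character length are all assumptions made for the sketch.

```python
import hashlib
import hmac


def stable_user_id(internal_id, secret, length=16):
    """Hypothetical helper: derive an opaque but stable end-user ID.

    The same (internal_id, secret) pair always yields the same value,
    so the upstream service sees a consistent identifier per user.
    """
    digest = hmac.new(secret, internal_id.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    return "user_" + digest[:length]


body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    # The secret is a placeholder; keep the real one out of source control.
    "user": stable_user_id("account-42", b"example-secret"),
}
print(body["user"])
```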