Provider Routing

Control which AI infrastructure providers handle your request: by price, latency, throughput, or an explicit allowlist. Proxyify forwards the `provider` object to its upstream routing layer unchanged, so all provider selection logic runs there.

The `provider` object

Add a `provider` key to any text or image request body:

json

{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [{ "role": "user", "content": "Hello" }],
  "provider": {
    "sort": "latency",
    "order": ["fireworks", "together"],
    "allow_fallbacks": true,
    "allow": ["fireworks", "together", "deepinfra"],
    "ignore": ["openai"],
    "quantizations": ["fp8", "fp16"],
    "data_collection": "deny"
  }
}

Field	Type	Description
`sort`	string	`"price"` / `"throughput"` / `"latency"` — disables load-balancing and sorts by the chosen metric.
`order`	list[string]	Try these providers in order before falling back to others. Example: `["openai", "azure"]`
`allow_fallbacks`	bool	Default `true`. Set `false` to use only the providers in `order`; the request returns 503 if none is available.
`allow`	list[string]	Allowlist — only these providers will be used.
`ignore`	list[string]	Blocklist — these providers will never be used.
`quantizations`	list[string]	Accepted quantization levels: `int4`, `int8`, `fp6`, `fp8`, `fp16`, `bf16`.
`data_collection`	string	`"deny"` — prefer zero-data-retention (ZDR) providers. `"allow"` (default) — no restriction.
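
As a minimal sketch, the request body above could be assembled in Python with a few client-side checks against the values listed in the table. The helper name `build_provider` and its checks are illustrative assumptions, not part of Proxyify; the actual selection logic still runs upstream.

```python
import json

# Allowed values copied from the table above; the helper itself is hypothetical.
SORT_VALUES = {"price", "throughput", "latency"}
QUANTIZATIONS = {"int4", "int8", "fp6", "fp8", "fp16", "bf16"}


def build_provider(sort=None, order=None, allow_fallbacks=True,
                   allow=None, ignore=None, quantizations=None,
                   data_collection="allow"):
    """Return a `provider` dict, rejecting values the table does not list."""
    if sort is not None and sort not in SORT_VALUES:
        raise ValueError(f"unknown sort: {sort!r}")
    bad = set(quantizations or []) - QUANTIZATIONS
    if bad:
        raise ValueError(f"unknown quantizations: {sorted(bad)}")
    if data_collection not in {"allow", "deny"}:
        raise ValueError(f"unknown data_collection: {data_collection!r}")

    provider = {"allow_fallbacks": allow_fallbacks,
                "data_collection": data_collection}
    # Only include the optional fields the caller actually set.
    for key, value in (("sort", sort), ("order", order), ("allow", allow),
                       ("ignore", ignore), ("quantizations", quantizations)):
        if value is not None:
            provider[key] = value
    return provider


body = {
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": build_provider(sort="latency",
                               order=["fireworks", "together"],
                               quantizations=["fp8", "fp16"],
                               data_collection="deny"),
}
print(json.dumps(body, indent=2))
```

Catching a misspelled `sort` or quantization value before sending is a convenience only; how the upstream layer treats unknown values is not covered here.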

Model slug shortcuts

Append a suffix directly to the model slug instead of writing a full provider object:

Suffix	Equivalent to
`:nitro`	`provider.sort = "throughput"` (highest-throughput provider)
`:floor`	`provider.sort = "price"` (cheapest provider)

json (nitro shortcut)

{
  "model": "meta-llama/llama-3.3-70b-instruct:nitro",
  "messages": [{ "role": "user", "content": "Hello" }]
}
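
To make the equivalence in the table concrete, the following sketch expands a suffixed slug into the plain slug plus the matching `provider` fields. The function name `expand_shortcut` is hypothetical, and the expansion is shown client-side purely for illustration; the suffix is normally interpreted by the routing layer.

```python
# Suffix-to-sort mapping taken from the table above.
SUFFIX_SORT = {"nitro": "throughput", "floor": "price"}


def expand_shortcut(model):
    """Split a slug like 'vendor/name:nitro' into (slug, provider dict)."""
    base, sep, suffix = model.rpartition(":")
    if sep and suffix in SUFFIX_SORT:
        return base, {"sort": SUFFIX_SORT[suffix]}
    # No recognised suffix: leave the slug untouched.
    return model, {}


slug, provider = expand_shortcut("meta-llama/llama-3.3-70b-instruct:nitro")
print(slug, provider)
```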

Fallback model list

Use the `models` array to define a fallback chain. If the primary model fails, the entries in `models` are tried in order:

json

{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
  "route": "fallback"
}
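
The order of attempts described above can be sketched as a small loop. This only illustrates the sequence (primary model first, then each entry of `models`); the real fallback happens upstream, and what counts as a failure is not specified here. The names `attempt_order`, `first_success`, and `flaky` are hypothetical.

```python
def attempt_order(body):
    """List the models in the order the fallback chain would try them."""
    return [body["model"], *body.get("models", [])]


def first_success(body, call):
    """Try each model in order; `call` is a hypothetical function that
    returns a response or raises an exception on failure."""
    errors = []
    for model in attempt_order(body):
        try:
            return model, call(model)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append((model, exc))
    raise RuntimeError(f"all models failed: {errors}")


def flaky(model):
    # Stand-in for a real request: pretend the primary model is down.
    if model == "openai/gpt-4o":
        raise ConnectionError("primary unavailable")
    return f"response from {model}"


body = {
    "model": "openai/gpt-4o",
    "models": ["anthropic/claude-3.5-sonnet", "google/gemini-pro"],
}
used, reply = first_success(body, flaky)
print(used, reply)
```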

Plugins

Extend model capabilities with upstream plugins. Pass a `plugins` array in the request:

json

{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Summarise today's AI news" }],
  "plugins": [
    { "id": "web" },
    { "id": "response-healing" }
  ]
}

Plugin ID	Description
`web`	Real-time web search — model can fetch current information.
`file-parser`	PDF and document processing.
`response-healing`	Automatically repairs malformed JSON responses.
`context-compression`	Compresses middle portions of long contexts to reduce token usage.
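
A small helper can build the `plugins` array and reject IDs that are not in the table above. This is a sketch under the assumption that each plugin entry needs only an `id` key, as in the example request; `build_plugins` is a hypothetical name.

```python
# Plugin IDs copied from the table above.
PLUGIN_IDS = {"web", "file-parser", "response-healing", "context-compression"}


def build_plugins(*ids):
    """Return a `plugins` array, rejecting IDs the table does not list."""
    unknown = [plugin_id for plugin_id in ids if plugin_id not in PLUGIN_IDS]
    if unknown:
        raise ValueError(f"unknown plugin IDs: {unknown}")
    return [{"id": plugin_id} for plugin_id in ids]


plugins = build_plugins("web", "response-healing")
print(plugins)
```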

End-user ID

Pass a stable identifier for the end-user in the `user` field. It is used for abuse detection upstream; Proxyify does not log or store it.

json

{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Hello" }],
  "user": "user_abc123"
}
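
One possible way to produce a stable identifier without sending an internal account ID is to derive it with a keyed hash. This is an assumption about how a client might choose the value, not a Proxyify requirement; the helper name `stable_user_id`, the `user_` prefix, and the 16-character length are all illustrative.

```python
import hashlib
import hmac


def stable_user_id(internal_id, secret):
    """Derive the same opaque ID for the same internal ID on every call."""
    digest = hmac.new(secret, internal_id.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    # Truncated for readability; length and prefix are arbitrary choices.
    return "user_" + digest[:16]


secret = b"replace-with-your-own-secret"  # placeholder, not a real key
body = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "user": stable_user_id("account-42", secret),
}
print(body["user"])
```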
