Chat Completions | Proxyify Docs

Chat Completions

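All modalities (text, image, video, speech-to-text, and text-to-speech) route through a single endpoint. The format follows OpenAI's Chat Completions API. In the examples on this page, JSON responses carry the result under data and usage metadata under _balancer.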

Endpoint

http
POST https://proxyify.dev/v1/chat/completions

Text (Chat)

json (request)
{ "model": "openai/gpt-4o", "messages": [{ "role": "user", "content": "Hello" }], "stream": false }
json (response)
{ "data": { "choices": [{ "message": { "role": "assistant", "content": "Hi!" } }], "usage": { "prompt_tokens": 10, "completion_tokens": 4, "total_tokens": 14 } }, "_balancer": { "credits_used": 24, "cost_usd": 0.024, "model_used": "openai/gpt-4o", "input_tokens": 120, "output_tokens": 80, "latency_ms": 1240, "cached": false } }

Set "stream": true to receive the response as Server-Sent Events (SSE). See the Streaming guide.
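A minimal Python sketch of the non-streaming text call above, using only the standard library. It assumes the API key is sent as a Bearer token in the Authorization header (this page mentions the header but not the scheme). The names build_chat_request, extract_reply, and chat are illustrative and not part of any Proxyify SDK.

```python
import json
import urllib.request

API_URL = "https://proxyify.dev/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming text request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer scheme; the docs only name the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(body: dict):
    """Return (assistant text, _balancer metadata) from a parsed response body."""
    message = body["data"]["choices"][0]["message"]
    return message["content"], body.get("_balancer", {})


def chat(api_key: str, model: str, prompt: str):
    """Send the request and return (assistant text, _balancer metadata)."""
    with urllib.request.urlopen(build_chat_request(api_key, model, prompt)) as resp:
        return extract_reply(json.load(resp))
```

Building the request and parsing the response are kept separate from the network call so that each can be checked against the sample payloads without sending anything.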

Image generation

json (request)
{
  "model": "black-forest-labs/flux-1.1-pro",
  "messages": [{ "role": "user", "content": "A sunset over mountains" }],
  "modalities": ["image", "text"],
  "image_config": {
    "aspect_ratio": "16:9",  // "1:1" (default), "16:9", "9:16"
    "image_size": "1K"       // "1K" (default), "2K", "4K"
  }
}
json (response)
{ "data": { "choices": [{ "message": { "role": "assistant", "images": [{ "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }] } }] }, "_balancer": { "credits_used": 80, "cost_usd": 0.08, "images_count": 1, "latency_ms": 3200 } }
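The image in the response above arrives as a base64 data URL. The following Python sketch decodes every entry in the images list into raw bytes. It assumes all images are returned as base64 data URLs, as in the sample; extract_images is an illustrative helper name.

```python
import base64


def extract_images(body: dict) -> list:
    """Decode each base64 data URL in the first choice's `images` list to bytes."""
    message = body["data"]["choices"][0]["message"]
    decoded = []
    for item in message.get("images", []):
        url = item["image_url"]["url"]
        # A data URL looks like "data:image/png;base64,<payload>".
        header, _, payload = url.partition(",")
        if not (header.startswith("data:") and header.endswith(";base64")):
            raise ValueError(f"unsupported image URL: {header[:40]}")
        decoded.append(base64.b64decode(payload))
    return decoded
```

Each returned item can be written to disk in binary mode, for example with pathlib.Path("out.png").write_bytes(images[0]).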

Video generation (async)

Video generation is asynchronous — submit a job, then poll for the result.

json (submit request)
{ "model": "kling/kling-video-v3-pro", "prompt": "A golden retriever on a sunny beach", "duration": 5, "resolution": "720p", "aspect_ratio": "16:9" }
json (submit response — 202 Accepted)
{ "data": { "id": "abc123", "status": "pending", "polling_url": "/v1/jobs/abc123" }, "_balancer": { "job_id": "abc123", "model_used": "kling/kling-video-v3-pro" } }

Poll the polling_url with the same Authorization header until status is completed:

json (poll response — completed)
{ "data": { "id": "abc123", "status": "completed", "video_url": "https://..." }, "_balancer": { "credits_used": 672, "cost_usd": 0.672, "duration_seconds": 5 } }
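The submit-then-poll flow above can be sketched in Python as follows. This is an illustration under stated assumptions: polling_url is taken to be relative to https://proxyify.dev, the polling interval and timeout are arbitrary defaults, and only the two statuses shown on this page (pending and completed) are handled. A real client should also stop on whatever failure status the API reports. The names absolute_polling_url and wait_for_job are hypothetical.

```python
import time
from typing import Callable
from urllib.parse import urljoin

BASE_URL = "https://proxyify.dev"  # assumption: polling_url is relative to this host


def absolute_polling_url(polling_url: str) -> str:
    """Resolve the relative polling_url from the submit response."""
    return urljoin(BASE_URL, polling_url)


def wait_for_job(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job status is "completed" or the timeout elapses.

    fetch is expected to GET the polling URL with the same Authorization
    header as the submit request and return the parsed JSON body.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch()
        if body["data"]["status"] == "completed":
            return body
        # Give up if another wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {body['data']['status']!r} after {timeout}s")
        sleep(interval)
```

Passing fetch and sleep as parameters keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.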

Speech-to-Text (STT)

json (request)
{ "model": "openai/whisper-1", "input_audio": { "data": "<base64_encoded_audio>", "format": "wav" }, "language": "en" }
json (response)
{ "data": { "text": "Hello, this is a test." }, "_balancer": { "credits_used": 1, "audio_seconds": 9.2, "model_used": "openai/whisper-1" } }

Supported STT models: openai/whisper-1 · openai/whisper-large-v3 · openai/whisper-large-v3-turbo · openai/gpt-4o-transcribe · openai/gpt-4o-mini-transcribe · google/chirp-3
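The STT request above expects the audio as a base64 string in input_audio.data. A short Python sketch of building that payload from raw audio bytes follows; build_stt_payload is an illustrative name, and the defaults simply mirror the sample request.

```python
import base64


def build_stt_payload(
    audio: bytes,
    audio_format: str = "wav",
    language: str = "en",
    model: str = "openai/whisper-1",
) -> dict:
    """Return a JSON-serialisable STT request body for the given audio bytes."""
    return {
        "model": model,
        "input_audio": {
            # Base64 text is pure ASCII, so it can be embedded directly in JSON.
            "data": base64.b64encode(audio).decode("ascii"),
            "format": audio_format,
        },
        "language": language,
    }
```

In practice the bytes would come from a file, for example pathlib.Path("clip.wav").read_bytes(), and the transcript is read from data.text in the response.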

Text-to-Speech (TTS)

json (request)
{ "model": "openai/gpt-4o-mini-tts", "input": "Hello, this is a TTS test.", "voice": "alloy", "response_format": "mp3" }

TTS responses are raw audio byte streams (Content-Type: audio/mpeg). The _balancer data is in response headers:

http (response headers)
X-Balancer-Credits-Used: 5
X-Balancer-Cost-USD: 0.005
X-Balancer-Model-Used: openai/gpt-4o-mini-tts

Supported TTS models: openai/gpt-4o-mini-tts · google/gemini-3.1-flash-tts-preview · mistralai/voxtral-mini-tts · hexgrad/kokoro-82m · sesame/csm-1b
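Because a TTS response body is raw audio, the usage metadata has to be read from the X-Balancer-* headers. The Python sketch below collects the three headers shown above into a dictionary. It assumes credits are reported as an integer, as in the sample, and skips any header that is absent; parse_balancer_headers is an illustrative name.

```python
def parse_balancer_headers(headers) -> dict:
    """Collect the X-Balancer-* response headers into a typed dictionary.

    `headers` is any mapping of header names to string values.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lookup = {name.lower(): value for name, value in headers.items()}
    fields = {
        "x-balancer-credits-used": ("credits_used", int),  # assumption: whole credits
        "x-balancer-cost-usd": ("cost_usd", float),
        "x-balancer-model-used": ("model_used", str),
    }
    result = {}
    for header, (key, convert) in fields.items():
        if header in lookup:
            result[key] = convert(lookup[header])
    return result
```

The audio itself is the unmodified response body and can be written straight to a file in binary mode.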

Errors

All errors follow this format:

json (error response)
{ "error": { "code": 402, "message": "Insufficient credits. Please top up your balance at proxyify.dev/billing.", "metadata": {} } }
Status  Meaning
400     Invalid parameters or prompt injection detected
401     Missing, invalid, or expired API key
402     Insufficient credits or spending limit reached
403     Blocked by key restriction (IP, origin, country, model, time)
408     Provider timeout
429     Rate limit exceeded; check the Retry-After header
502     Provider returned an error
503     No provider available for this model
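One way a client might act on the error format and status codes above is sketched below in Python. Treating 408, 429, 502, and 503 as retryable is an assumption of this sketch rather than documented Proxyify behaviour, as are the backoff parameters. Only the delay-in-seconds form of Retry-After is handled. ProxyifyError, parse_error, and retry_delay are hypothetical names.

```python
from typing import Mapping, Optional

# Assumption: transient provider and rate-limit errors are worth retrying;
# 400, 401, 402, and 403 need a change to the request, key, or balance first.
RETRYABLE_STATUSES = {408, 429, 502, 503}


class ProxyifyError(Exception):
    """Error carrying the code, message, and metadata from an error response."""

    def __init__(self, code: int, message: str, metadata: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.metadata = metadata


def parse_error(body: dict) -> ProxyifyError:
    """Turn a parsed error response body into a ProxyifyError."""
    err = body["error"]
    return ProxyifyError(err["code"], err["message"], err.get("metadata", {}))


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry is pointless."""
    if status not in RETRYABLE_STATUSES:
        return None
    if status == 429:
        lookup = {name.lower(): value for name, value in headers.items()}
        value = lookup.get("retry-after", "").strip()
        if value.isdigit():  # delay-in-seconds form only
            return float(value)
    # Exponential backoff, capped, for everything else.
    return min(cap, base * 2 ** attempt)
```

A caller would raise parse_error(body) when retry_delay returns None, and otherwise sleep for the returned number of seconds before sending the request again.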