
Streaming

Stream tokens in real time using Server-Sent Events (SSE). Works with any model that supports streaming.

Enable streaming

Set "stream": true in your request body. The response will be a stream of data: events instead of a single JSON object.

json (request)
{
  "model": "openai/gpt-4o",
  "messages": [{ "role": "user", "content": "Write a short poem." }],
  "stream": true
}

SSE format

The response is a sequence of data: lines, one per token chunk, terminated by data: [DONE]:

sse stream
data: {"choices":[{"delta":{"role":"assistant","content":""},"index":0}]}
data: {"choices":[{"delta":{"content":"Roses"},"index":0}]}
data: {"choices":[{"delta":{"content":" are"},"index":0}]}
data: {"choices":[{"delta":{"content":" red"},"index":0}]}
data: {"choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]
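You will usually consume the stream through an SDK (see the code examples below), but the format is simple enough to parse by hand. Here is a minimal sketch in Python using the third-party requests library; the endpoint and key placeholder mirror the examples later on this page:

Python
import json
import requests

# Sketch: read the SSE stream without an SDK. Assumes the requests
# package is installed; endpoint and key follow this page's examples.
resp = requests.post(
    "https://proxyify.dev/v1/chat/completions",
    headers={"Authorization": "Bearer prx-xxxxxxxxxxxxxxxx"},
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Write a short poem."}],
        "stream": True,
    },
    stream=True,  # keep the connection open and read incrementally
)

for raw in resp.iter_lines():
    if not raw:
        continue  # skip blank separator lines between events
    line = raw.decode("utf-8")
    if not line.startswith("data: "):
        continue
    payload = line[len("data: "):]
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)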

Delta objects

Each chunk contains a delta object with only the new content for that chunk:

  • First chunk: {"role": "assistant", "content": ""}
  • Content chunks: {"content": "token"}
  • Final chunk: {"finish_reason": "stop"} (or "error" on mid-stream failure)

Accumulate delta.content values across all chunks to reconstruct the full response.
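A minimal sketch of that accumulation, reusing the stream object from the Python example below:

Python
full_text = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        full_text += delta.content  # append each new token chunk

print(full_text)  # the reconstructed assistant response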

Mid-stream errors

If the provider errors after tokens have already been sent, the HTTP status stays 200. The stream ends with a final chunk containing finish_reason: "error". Credits are charged for the tokens that were generated.

Always check finish_reason on the final delta. If it is "error", the response is incomplete, and credits for the tokens that were generated were still deducted.
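For instance, a sketch of that check on top of the SDK loop from the Python example below (finish_reason is exposed on each chunk's choice):

Python
finish_reason = None
for chunk in stream:
    choice = chunk.choices[0]
    if choice.delta.content:
        print(choice.delta.content, end="", flush=True)
    if choice.finish_reason is not None:
        finish_reason = choice.finish_reason  # set on the final delta

if finish_reason == "error":
    # Mid-stream failure: the output above is partial, and credits
    # for the generated tokens were still deducted.
    print("\n[incomplete response]")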

Code examples

Python
from openai import OpenAI

client = OpenAI(
    api_key="prx-xxxxxxxxxxxxxxxx",
    base_url="https://proxyify.dev/v1",
)

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
JavaScript / Node
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "prx-xxxxxxxxxxxxxxxx",
  baseURL: "https://proxyify.dev/v1",
});

const stream = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [{ role: "user", content: "Write a short poem." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content ?? "";
  process.stdout.write(content);
}
curl
curl https://proxyify.dev/v1/chat/completions \
  -H "Authorization: Bearer prx-xxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o","messages":[{"role":"user","content":"Write a short poem."}],"stream":true}' \
  --no-buffer