POST https://gateway.llm-stats.com/v1/generations
GET  https://gateway.llm-stats.com/v1/generations/{id}
Image and video generation share a single resource. You POST once, the server holds the connection open while the job runs, and you get back a final response with a direct URL to the asset. If the job needs longer than the wait window, you GET the same id with another wait — no client-side polling cadence to manage.

At a glance

  • One resource. POST /v1/generations for every image and video model.
  • Server-side long-polling. wait="auto" (default) blocks for up to 60s — most short jobs return as completed in a single round-trip.
  • Strict input. Unknown fields are rejected, so typos surface immediately.
  • Stable error codes. Branch on error.code, never on error.message.

Quickstart

import requests

# 1. Create the generation. wait="auto" long-polls (up to 60s).
res = requests.post(
    "https://gateway.llm-stats.com/v1/generations",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "flux-1.1-pro",
        "input": {"prompt": "A beautiful sunset over mountains"},
        "wait": "auto",
    },
)
job = res.json()

# 2. If it didn't finish in time, poll the same resource.
while job["status"] in ("queued", "running"):
    res = requests.get(
        f"https://gateway.llm-stats.com/v1/generations/{job['id']}",
        params={"wait": 60},
        headers={"Authorization": "Bearer YOUR_API_KEY"},
    )
    job = res.json()

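if job["status"] == "completed":
    print("image:", job["output"]["media"][0]["url"])
else:
    # error is only set for failed jobs; a cancelled job may have none.
    error = job.get("error") or {}
    print(job["status"] + ":", error.get("code"), error.get("message"))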

Create a generation

POST /v1/generations accepts a strict JSON body. Unknown top-level or nested keys are rejected with invalid_input.

Request

model
string
required
Model ID (e.g. flux-1.1-pro, veo-3, runway-gen4). Use the model ID exactly as listed on llm-stats.com.
input
object
required
Inputs to the generation. See Input fields.
n
integer
default:"1"
Number of images to generate (1–10). Image-only — ignored for video models.
wait
number | "auto"
default:"auto"
Server-side long-poll window in seconds (0–60). Pass 0 for fire-and-forget (returns immediately with status: "queued"). "auto" picks a sensible default per modality.

Input fields

All input fields live under input and are optional unless noted.
prompt
string
required
The text prompt. 1–8000 characters.
images
string[]
Up to 8 reference image URLs for image-to-image and image-to-video models.
aspect_ratio
string
Aspect ratio in W:H form (e.g. "16:9", "1:1", "9:16"). Capped to the model’s supported set.
size
string
Explicit pixel size (e.g. "1024x1024"). Models that don’t accept size ignore it; use aspect_ratio for those models instead.
duration
number | string
Video clip duration in seconds (e.g. 8). Image models ignore this.
resolution
string
Video resolution (e.g. "720p", "1080p"). Image models ignore this.
seed
integer
Deterministic seed when supported by the provider.
negative_prompt
string
Concepts to discourage. Supported by some image models.
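Because unknown keys are rejected with invalid_input, it can help to catch typos before the request is sent. The following is a minimal sketch, not part of the API: build_generation_body is a hypothetical helper, it assumes the fields listed above are the complete set, and the server remains the source of truth for what each model accepts.

```python
# Hypothetical client-side helper; field names are taken from the tables above.
INPUT_FIELDS = {
    "prompt", "images", "aspect_ratio", "size",
    "duration", "resolution", "seed", "negative_prompt",
}


def build_generation_body(model, prompt, wait="auto", n=None, **input_fields):
    """Build a POST /v1/generations body, rejecting unknown input keys early."""
    unknown = set(input_fields) - INPUT_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    if not 1 <= len(prompt) <= 8000:
        raise ValueError("prompt must be 1-8000 characters")
    if len(input_fields.get("images", [])) > 8:
        raise ValueError("at most 8 reference images are allowed")
    if wait != "auto" and not (isinstance(wait, (int, float)) and 0 <= wait <= 60):
        raise ValueError('wait must be "auto" or a number from 0 to 60')

    body = {"model": model, "input": {"prompt": prompt, **input_fields}, "wait": wait}
    if n is not None:
        if not 1 <= n <= 10:
            raise ValueError("n must be between 1 and 10")
        body["n"] = n
    return body


# Example: a video request. Whether a given model honours each field is
# model-specific, so treat these values as illustrative.
body = build_generation_body(
    "veo-3",
    "A slow pan across a mountain lake",
    aspect_ratio="16:9",
    duration=8,
    resolution="720p",
)
```

A misspelled key such as aspect_ration raises ValueError locally instead of costing a round-trip.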

Response

Both endpoints return the same GenerationResponse shape — a single resource you can poll, store, and re-fetch.
id
string
Stable identifier for this generation.
object
string
Always "generation".
status
"queued" | "running" | "completed" | "failed" | "cancelled"
Lifecycle state. queued and running are non-terminal; the rest are terminal.
model
string
Echoes the model from the request.
created_at
string
ISO-8601 timestamp.
completed_at
string | null
Set once the generation reaches a terminal state.
output
object | null
Present once status === "completed". Contains a media array.
output.media[]
MediaArtifact[]
One entry per produced asset. See MediaArtifact.
usage
object | null
Present on terminal states. usage.cost_usd is the billed cost in USD.
error
object | null
Present on status === "failed". { code, message } — see Error codes.

MediaArtifact

type
"image" | "video"
url
string
Signed URL. Download or copy the asset before the URL expires (typically one hour).
format
string | null
e.g. "png", "jpeg", "mp4".
width
integer | null
height
integer | null
duration_seconds
number | null
Video only.
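Since signed URLs expire, one option is to copy every asset to your own storage as soon as the job completes. The sketch below uses only the standard library; artifact_filename and save_artifacts are hypothetical helpers, and the naming scheme and the "bin" fallback extension are assumptions.

```python
import urllib.request
from pathlib import Path


def artifact_filename(generation_id, index, artifact):
    """Derive a local filename such as gen_abc-0.png from a MediaArtifact."""
    # format may be null, so fall back on a generic extension (an assumption).
    ext = artifact.get("format") or "bin"
    return f"{generation_id}-{index}.{ext}"


def save_artifacts(job, dest_dir, opener=urllib.request.urlopen):
    """Download every asset of a completed generation; return the saved paths."""
    if job["status"] != "completed":
        raise ValueError(f"generation is {job['status']}, not completed")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, artifact in enumerate(job["output"]["media"]):
        path = dest / artifact_filename(job["id"], i, artifact)
        # opener is injectable so the function can be exercised without network.
        with opener(artifact["url"]) as resp:
            path.write_bytes(resp.read())
        paths.append(path)
    return paths
```

Calling save_artifacts(job, "downloads") right after the job reaches completed keeps the assets available after the signed URL has expired.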

Example response (completed image)

{
  "id": "gen_01H…",
  "object": "generation",
  "status": "completed",
  "model": "flux-1.1-pro",
  "created_at": "2026-04-18T12:00:00Z",
  "completed_at": "2026-04-18T12:00:09Z",
  "output": {
    "media": [
      {
        "type": "image",
        "url": "https://…/gen_01H….png?X-Amz-Signature=…",
        "format": "png",
        "width": 1024,
        "height": 1024
      }
    ]
  },
  "usage": { "cost_usd": 0.04 }
}

Example response (still running)

If wait elapses without a terminal status, you get the same shape with status: "running" and a Retry-After header. Re-fetch the same id:
{
  "id": "gen_01H…",
  "object": "generation",
  "status": "running",
  "model": "veo-3",
  "created_at": "2026-04-18T12:00:00Z",
  "completed_at": null,
  "output": null,
  "usage": null,
  "error": null
}

Fetch a generation

GET /v1/generations/{id}
wait
number
default:"0"
Optional long-poll window (0–60s). Passing wait lets a single GET block until the job is terminal or the window elapses, mirroring the POST ergonomics.
The response is identical to the POST shape. Terminal responses are safe to cache (Cache-Control: private, max-age=60); non-terminal responses are returned with Cache-Control: no-store.

How wait actually works

  • wait: "auto" (default). The server picks per modality: a long window for image jobs, which usually finish within it, and a short one for video jobs, which usually don’t.
  • wait: 0. Fire-and-forget. The response always returns immediately; poll the resource yourself when you’re ready.
  • wait: N (1–60). Server holds the connection up to N seconds. The cap is below typical proxy idle timeouts, so you won’t hit gateway 504s.
The flow is the same in every case:
POST /v1/generations         →  {status: "queued" | "running" | "completed" | "failed"}
↓ (still running?)
GET  /v1/generations/{id}?wait=60  →  …repeat until terminal
You never need to implement a polling cadence — every wait is server-side.
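The "repeat until terminal" step can be sketched as a small loop. This version also adds an overall deadline so that a slow job cannot block the caller indefinitely. wait_until_terminal is a hypothetical helper, and fetch is an assumed callable that wraps GET /v1/generations/{id} and returns the parsed response.

```python
import time

TERMINAL = {"completed", "failed", "cancelled"}


def wait_until_terminal(job, fetch, deadline_s=300.0, wait=60, clock=time.monotonic):
    """Re-fetch a generation until it is terminal or the outer deadline passes.

    fetch(generation_id, wait) must return the parsed GenerationResponse.
    Returns the last response seen; the caller inspects job["status"].
    """
    give_up_at = clock() + deadline_s
    while job["status"] not in TERMINAL:
        remaining = give_up_at - clock()
        if remaining <= 0:
            break  # still queued/running; keep job["id"] to resume later
        # Never ask the server to hold the connection past our own deadline.
        job = fetch(job["id"], max(1, min(wait, int(remaining))))
    return job
```

With requests, fetch could be a short function that calls requests.get on the generation URL with params={"wait": wait} and the Authorization header, then returns res.json(). There is no sleep in the loop because each call already blocks on the server for up to wait seconds.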

Error codes

Errors share the unified envelope. Generation-specific codes you’ll see most:
error.code            HTTP  When it happens
invalid_input         400   Validation, unknown fields, out-of-range parameters.
unauthenticated       401   Missing or invalid API key.
insufficient_quota    402   Account out of credit or over plan limits.
model_unavailable     403   Model isn’t enabled for your account.
not_found             404   Unknown {id} on GET.
content_policy        422   Provider rejected the prompt or input image.
rate_limited          429   Slow down. Retry-After tells you for how long.
provider_unavailable  502   Every provider for this model returned an error.
provider_timeout      504   Every provider for this model timed out.
internal_error        500   Bug on our side; open a support ticket with the id.
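Branching on error.code might look like the sketch below. The grouping into actions is an assumption made for illustration, not something the API prescribes; in particular, treating the provider errors as retryable is a judgment call. action_for is a hypothetical helper.

```python
# Assumed mapping from documented error codes to a client-side action.
ACTIONS = {
    "invalid_input": "fix_request",
    "content_policy": "fix_request",
    "not_found": "fix_request",
    "unauthenticated": "fix_account",
    "insufficient_quota": "fix_account",
    "model_unavailable": "fix_account",
    "rate_limited": "retry",          # honour Retry-After before retrying
    "provider_unavailable": "retry",
    "provider_timeout": "retry",
    "internal_error": "report",
}


def action_for(error_code):
    """Return the action for an error code; unknown codes are reported."""
    return ACTIONS.get(error_code, "report")
```

Keying on the code rather than the message keeps this table valid even if the human-readable text changes.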

Patterns and tips

Persist job.id immediately after POST. If your worker crashes or your user closes the tab, you can resume by re-fetching the same id — even hours later — and you’ll get the final state, including the signed URL.
wait caps a single request at 60s. Apply your own outer deadline (e.g. 5 minutes for image, 10 minutes for video) and bail out cleanly with the last id so the user can be notified later.
Pass reference URLs in input.images (max 8). The first reference is the primary input for single-reference models.
Set input.seed for deterministic outputs on supported models. Same model + same provider + same seed + same prompt → same image.