POST once, the server holds the connection open while the job runs, and you get back a final response with a direct URL to the asset. If the job needs longer than the wait window, you GET the same id with another wait — no client-side polling cadence to manage.
At a glance
- One resource. POST /v1/generations for every image and video model.
- Server-side long-polling. wait="auto" (default) blocks for up to 60s; most short jobs return as completed in a single round-trip.
- Strict input. Unknown fields are rejected, so typos surface immediately.
- Stable error codes. Branch on error.code, never on error.message.
Quickstart
Create a generation
POST /v1/generations accepts a strict JSON body. Unknown top-level or nested keys are rejected with invalid_input.
Request
- Model ID (e.g. flux-1.1-pro, veo-3, runway-gen4). Use the model ID exactly as listed on llm-stats.com.
- Inputs to the generation. See Input fields.
- Number of images to generate (1–10). Image-only; ignored for video models.
- Server-side long-poll window in seconds (0–60). Pass 0 for fire-and-forget (returns immediately with status: "queued"). "auto" picks a sensible default per modality.
Input fields
All input fields live under input and are optional unless noted.
- The text prompt. 1–8000 characters.
- Up to 8 reference image URLs for image-to-image and image-to-video models.
- Aspect ratio in W:H form (e.g. "16:9", "1:1", "9:16"). Capped to the model’s supported set.
- Explicit pixel size (e.g. "1024x1024"). Models that don’t accept size ignore this; pick aspect_ratio instead.
- Video clip duration in seconds (e.g. 8). Image models ignore this.
- Video resolution (e.g. "720p", "1080p"). Image models ignore this.
- Deterministic seed when supported by the provider.
- Concepts to discourage. Supported by some image models.
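Because unknown fields are rejected, it can help to assemble the body in one place and check the documented limits before sending. The following is a minimal sketch, not an official client: the function name is hypothetical, and "prompt" is an assumed key name (the text describes the prompt field but this page does not show its key). The keys model, input, images, aspect_ratio, and seed are taken from the text above.

```python
from typing import Any, Dict, List, Optional

MAX_PROMPT_CHARS = 8000   # documented prompt length limit
MAX_REFERENCE_IMAGES = 8  # documented cap on reference image URLs


def build_generation_body(
    model: str,
    prompt: str,
    images: Optional[List[str]] = None,
    aspect_ratio: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a body for POST /v1/generations, checking documented limits first."""
    if not 1 <= len(prompt) <= MAX_PROMPT_CHARS:
        raise ValueError("prompt must be 1-8000 characters")
    images = images or []
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError("at most 8 reference images are allowed")

    # "prompt" is an assumed key name; the other keys are named in the text.
    payload: Dict[str, Any] = {"prompt": prompt}
    if images:
        payload["images"] = images
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    if seed is not None:
        payload["seed"] = seed
    # Only keys that were set are sent, so no stray field can trip strict validation.
    return {"model": model, "input": payload}
```

Serialize the result with json.dumps and send it with whichever HTTP client you already use.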
Response
Both endpoints return the same GenerationResponse shape: a single resource you can poll, store, and re-fetch.
- Stable identifier for this generation.
- Always "generation".
- Lifecycle state. queued and running are non-terminal; the rest are terminal.
- Echoes the model from the request.
- ISO-8601 timestamp.
- Set once the generation reaches a terminal state.
- Present once status === "completed". Contains a media array.
- One entry per produced asset. See MediaArtifact.
- Present on terminal states. usage.cost_usd is the billed cost in USD.
- Present on status === "failed". { code, message }; see Error codes.
MediaArtifact
- Signed URL. Download or copy the asset before the URL expires (typically one hour).
- e.g. "png", "jpeg", "mp4".
- Video only.
Example response (completed image)
Example response (still running)
If wait elapses without a terminal status, you get the same shape with
status: "running" and a Retry-After header. Re-fetch the same id:
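One way to act on that response, sketched in Python. The function name is hypothetical, and two details are assumptions rather than facts from this page: that the GET path is /v1/generations/{id} with wait passed as a query parameter, and that Retry-After carries a whole number of seconds.

```python
from typing import Mapping, Optional, Tuple

# Per the text, queued and running are non-terminal; every other status is terminal.
NON_TERMINAL = {"queued", "running"}


def next_fetch(
    job: Mapping[str, object], headers: Mapping[str, str]
) -> Optional[Tuple[str, float]]:
    """Return (path, delay_seconds) for the next GET, or None once the job is terminal."""
    if job["status"] not in NON_TERMINAL:
        return None
    # Assumption: Retry-After is a whole number of seconds; otherwise use no delay.
    raw = headers.get("Retry-After", "0")
    delay = float(raw) if raw.isdigit() else 0.0
    # Assumption: the path and the wait query parameter are not confirmed by this page.
    return f"/v1/generations/{job['id']}?wait=auto", delay
```

Sleep for the returned delay, issue the GET, and call the function again with the new response until it returns None.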
Fetch a generation
Optional long-poll window (0–60s). Passing wait lets you GET once and block until the job is terminal, mirroring the POST ergonomics.
The response has the same shape as the POST response. Terminal responses are safe to cache (Cache-Control: private, max-age=60); non-terminal responses are returned with Cache-Control: no-store.
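If you keep your own client-side cache, the same rule can be mirrored there. This is a minimal in-memory sketch under stated assumptions: the class name is hypothetical, the clock is injectable only to make the expiry easy to exercise, and the 60-second lifetime is copied from the max-age value above.

```python
import time
from typing import Callable, Dict, Mapping, Optional, Tuple

NON_TERMINAL = {"queued", "running"}  # never cached, like Cache-Control: no-store
MAX_AGE_S = 60.0                      # from Cache-Control: private, max-age=60

Job = Mapping[str, object]


class GenerationCache:
    """Stores terminal generations only and forgets them after MAX_AGE_S seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Job]] = {}

    def put(self, job: Job) -> bool:
        """Store the job if it is terminal; return whether it was stored."""
        if job["status"] in NON_TERMINAL:
            return False
        self._store[str(job["id"])] = (self._clock(), job)
        return True

    def get(self, job_id: str) -> Optional[Job]:
        """Return a cached job, or None if it is absent or older than MAX_AGE_S."""
        entry = self._store.get(job_id)
        if entry is None:
            return None
        stored_at, job = entry
        if self._clock() - stored_at > MAX_AGE_S:
            del self._store[job_id]
            return None
        return job
```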
How wait actually works
- wait: "auto" (default). The server picks per modality: long for image jobs (which usually finish quickly), short for video jobs (which usually don’t).
- wait: 0. Fire-and-forget. The response always returns immediately; poll the resource yourself when you’re ready.
- wait: N (1–60). The server holds the connection up to N seconds. The cap is below typical proxy idle timeouts, so you won’t hit gateway 504s.
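The three forms above can be checked client-side before a request goes out. A small sketch with a hypothetical helper name; treating N as an integer is an assumption, since this page does not say whether fractional seconds are accepted.

```python
from typing import Union


def normalize_wait(wait: Union[int, str] = "auto") -> Union[int, str]:
    """Validate a wait value against the documented forms: "auto" or 0-60 seconds."""
    if wait == "auto":
        return "auto"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(wait, bool) or not isinstance(wait, int):
        raise ValueError('wait must be "auto" or an integer number of seconds')
    if not 0 <= wait <= 60:
        raise ValueError("wait must be between 0 and 60 seconds")
    return wait
```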
Error codes
Errors share the unified envelope. Generation-specific codes you’ll see most:
| error.code | HTTP | When it happens |
|---|---|---|
| invalid_input | 400 | Validation, unknown fields, out-of-range parameters. |
| unauthenticated | 401 | Missing or invalid API key. |
| insufficient_quota | 402 | Account out of credit or over plan limits. |
| model_unavailable | 403 | Model isn’t enabled for your account. |
| not_found | 404 | Unknown {id} on GET. |
| content_policy | 422 | Provider rejected the prompt or input image. |
| rate_limited | 429 | Slow down. Retry-After tells you for how long. |
| provider_unavailable | 502 | Every provider for this model returned an error. |
| provider_timeout | 504 | Every provider for this model timed out. |
| internal_error | 500 | Bug on our side; open a support ticket with the id. |
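Branching on error.code might look like the sketch below. The grouping into retry, fix_request, fix_account, and report is this example's own judgment about the table above, not a policy stated by the API, and the function name is hypothetical.

```python
from typing import Mapping

# Assumption: these groupings are one reading of the table, not an official rule.
RETRYABLE = {"rate_limited", "provider_unavailable", "provider_timeout"}
FIX_REQUEST = {"invalid_input", "content_policy", "not_found"}
FIX_ACCOUNT = {"unauthenticated", "insufficient_quota", "model_unavailable"}


def classify_error(error: Mapping[str, str]) -> str:
    """Map an error envelope to an action using error.code only, never error.message."""
    code = error["code"]
    if code in RETRYABLE:
        return "retry"
    if code in FIX_REQUEST:
        return "fix_request"
    if code in FIX_ACCOUNT:
        return "fix_account"
    # internal_error, plus any code this sketch does not know about.
    return "report"
```

For rate_limited, pair the retry with the Retry-After header rather than a fixed delay.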
Patterns and tips
Always use the resource id, never poll a wall clock
Persist job.id immediately after POST. If your worker crashes or your user closes the tab, you can resume by re-fetching the same id, even hours later, and you’ll get the final state, including the signed URL.
Set realistic outer timeouts
wait caps a single request at 60s. Apply your own outer deadline (e.g. 5 minutes for image, 10 minutes for video) and bail out cleanly with the last id so the user can be notified later.
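An outer deadline around repeated long-polls could be sketched as follows. The function name is hypothetical, and fetch is a caller-supplied function (an assumption of this example) that performs one long-polling GET for the id and returns the parsed response body.

```python
import time
from typing import Callable, Dict

NON_TERMINAL = {"queued", "running"}

Job = Dict[str, object]


def wait_for_generation(
    fetch: Callable[[str], Job],
    job_id: str,
    deadline_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Job:
    """Re-fetch job_id until it is terminal or deadline_s seconds have elapsed."""
    start = clock()
    job = fetch(job_id)
    # Each fetch is expected to block server-side, so this loop adds no sleep of its own.
    while job["status"] in NON_TERMINAL and clock() - start < deadline_s:
        job = fetch(job_id)
    # A non-terminal status here means the deadline passed; keep job["id"] to resume later.
    return job
```

If fetch uses wait: 0, add your own delay between calls, since nothing else in the loop slows it down.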
Image-to-image / image-to-video
Pass reference URLs in input.images (max 8). The first reference is the primary input for single-reference models.
Reproducibility
Set input.seed for deterministic outputs on supported models. Same model + same provider + same seed + same prompt → same image.