Image & video generations

POST https://gateway.llm-stats.com/v1/generations
GET  https://gateway.llm-stats.com/v1/generations/{id}

Image and video generation share a single resource. You POST once, the server holds the connection open while the job runs, and you get back a final response with a direct URL to the asset. If the job needs longer than the wait window, you GET the same id with another wait — no client-side polling cadence to manage.

At a glance

One resource. POST /v1/generations for every image and video model.
Server-side long-polling. wait="auto" (default) blocks for up to 60s — most short jobs return as completed in a single round-trip.
Strict input. Unknown fields are rejected, so typos surface immediately.
Stable error codes. Branch on error.code, never on error.message.

Quickstart

import requests

API = "https://gateway.llm-stats.com/v1/generations"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# 1. Create the generation. wait="auto" long-polls (up to 60s).
res = requests.post(
    API,
    headers=HEADERS,
    json={
        "model": "flux-1.1-pro",
        "input": {"prompt": "A beautiful sunset over mountains"},
        "wait": "auto",
    },
    timeout=70,  # client timeout should exceed the 60s server-side wait
)
job = res.json()

# 2. If it didn't finish in time, re-fetch the same id with another server-side wait.
while job["status"] in ("queued", "running"):
    res = requests.get(
        f"{API}/{job['id']}",
        params={"wait": 60},
        headers=HEADERS,
        timeout=70,
    )
    job = res.json()

if job["status"] == "completed":
    print("image:", job["output"]["media"][0]["url"])
elif job["status"] == "failed":
    # Branch on error.code; error.message is for humans.
    print("failed:", job["error"]["code"], job["error"]["message"])
else:
    # "cancelled" is also terminal and carries no error object.
    print("ended with status:", job["status"])

Create a generation

POST /v1/generations accepts a strict JSON body. Unknown top-level or nested keys are rejected with invalid_input.

Request

model

string

required

Model ID (e.g. flux-1.1-pro, veo-3, runway-gen4). Use the model ID exactly as listed on llm-stats.com.

input

object

required

Inputs to the generation. See Input fields.

integer

default:"1"

Number of images to generate (1–10). Image-only — ignored for video models.

wait

number | "auto"

default:"auto"

Server-side long-poll window in seconds (0–60). Pass 0 for fire-and-forget (returns immediately with status: "queued"). "auto" picks a sensible default per modality.

Input fields

All input fields live under input and are optional unless noted.

prompt

string

required

The text prompt. 1–8000 characters.

images

string[]

Up to 8 reference image URLs for image-to-image and image-to-video models.

aspect_ratio

string

Aspect ratio in W:H form (e.g. "16:9", "1:1", "9:16"). Capped to the model’s supported set.

size

string

Explicit pixel size (e.g. "1024x1024"). Models that don’t accept size ignore this field; for those models, set aspect_ratio instead.

duration

number | string

Video clip duration in seconds (e.g. 8). Image models ignore this.

resolution

string

Video resolution (e.g. "720p", "1080p"). Image models ignore this.

seed

integer

Deterministic seed when supported by the provider.

negative_prompt

string

Concepts to discourage. Supported by some image models.
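
Because unknown keys are rejected, it can help to catch typos before the request leaves your process. The sketch below is a hypothetical client-side pre-check, not part of the API: the helper name check_input and its messages are illustrative, and the allowed keys are simply the fields listed in this section. The server remains the source of truth.

```python
# Hypothetical client-side pre-check; the key list mirrors the fields documented above.
ALLOWED_INPUT_KEYS = {
    "prompt", "images", "aspect_ratio", "size",
    "duration", "resolution", "seed", "negative_prompt",
}


def check_input(input_fields: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    # Strict input: any key outside the documented set would be rejected.
    unknown = sorted(set(input_fields) - ALLOWED_INPUT_KEYS)
    if unknown:
        problems.append(f"unknown fields: {', '.join(unknown)}")

    # prompt is the only required field and must be 1-8000 characters.
    prompt = input_fields.get("prompt")
    if not isinstance(prompt, str) or not 1 <= len(prompt) <= 8000:
        problems.append("prompt must be a string of 1-8000 characters")

    return problems
```

For example, check_input({"prompt": "x", "promt": "y"}) reports the misspelled key instead of letting the request fail with invalid_input.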

Response

Both endpoints return the same GenerationResponse shape — a single resource you can poll, store, and re-fetch.

id

string

Stable identifier for this generation.

object

string

Always "generation".

status

"queued" | "running" | "completed" | "failed" | "cancelled"

Lifecycle state. queued and running are non-terminal; the rest are terminal.

model

string

Echoes the model from the request.

created_at

string

ISO-8601 timestamp.

completed_at

string | null

Set once the generation reaches a terminal state.

output

object | null

Present once status === "completed". Contains a media array.

output.media[]

MediaArtifact[]

One entry per produced asset. See MediaArtifact.

usage

object | null

Present on terminal states. usage.cost_usd is the billed cost in USD.

error

object | null

Present on status === "failed". { code, message } — see Error codes.

MediaArtifact

type

"image" | "video"

url

string

Signed URL. Download or copy the asset before the URL expires (typically one hour).

format

string | null

e.g. "png", "jpeg", "mp4".

width

integer | null

height

integer | null

duration_seconds

number | null

Video only.
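
Since the signed URL expires, one reasonable approach is to copy every artifact to your own storage as soon as the generation completes. The following is a minimal sketch under stated assumptions: the helper names local_name and save_media are hypothetical, the "bin" fallback extension for a null format is an arbitrary choice, and the download step defaults to the standard library's urllib.request.urlretrieve.

```python
import urllib.request


def local_name(generation_id: str, index: int, artifact: dict) -> str:
    """Build a file name from the generation id, position, and reported format."""
    # format may be null, so fall back to a generic extension (an assumption).
    ext = artifact.get("format") or "bin"
    return f"{generation_id}-{index}.{ext}"


def save_media(job: dict, download=urllib.request.urlretrieve) -> list:
    """Download every artifact of a completed generation; return the local paths."""
    paths = []
    for index, artifact in enumerate(job["output"]["media"]):
        path = local_name(job["id"], index, artifact)
        download(artifact["url"], path)  # fetch the signed URL before it expires
        paths.append(path)
    return paths
```

The download parameter exists so the function can be exercised without network access, or swapped for an upload to object storage.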

Example response (completed image)

{
  "id": "gen_01H…",
  "object": "generation",
  "status": "completed",
  "model": "flux-1.1-pro",
  "created_at": "2026-04-18T12:00:00Z",
  "completed_at": "2026-04-18T12:00:09Z",
  "output": {
    "media": [
      {
        "type": "image",
        "url": "https://…/gen_01H….png?X-Amz-Signature=…",
        "format": "png",
        "width": 1024,
        "height": 1024
      }
    ]
  },
  "usage": { "cost_usd": 0.04 }
}

Example response (still running)

If wait elapses without a terminal status, you get the same shape with status: "running" and a Retry-After header. Re-fetch the same id:

{
  "id": "gen_01H…",
  "object": "generation",
  "status": "running",
  "model": "veo-3",
  "created_at": "2026-04-18T12:00:00Z",
  "completed_at": null,
  "output": null,
  "usage": null,
  "error": null
}
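
If you want to honor the Retry-After header before re-fetching, a small parser is enough. This sketch assumes the header carries a number of seconds; HTTP also allows a date in this header, and the helper (a hypothetical name, retry_after_seconds) simply falls back to a default in that case.

```python
def retry_after_seconds(headers, default: float = 1.0) -> float:
    """Read Retry-After as a delay in seconds, or return `default`."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # Clamp negatives to zero; a date-form value raises ValueError below.
        return max(0.0, float(value))
    except ValueError:
        return default
```

With the requests library, res.headers is case-insensitive, so retry_after_seconds(res.headers) works as written; a plain dict needs the exact "Retry-After" key.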

Fetch a generation

GET /v1/generations/{id}

wait

number

default:"0"

Optional long-poll window (0–60s). Passing wait lets you GET once and block until the job is terminal, mirroring the POST ergonomics.

The response is identical to the POST shape. Terminal responses are safe to cache (Cache-Control: private, max-age=60); non-terminal responses are returned with Cache-Control: no-store.

How `wait` actually works

wait: "auto" (default). The server picks a window per modality: long for image jobs (which usually finish quickly), short for video jobs (which usually don’t).
wait: 0. Fire-and-forget. The response always returns immediately; poll the resource yourself when you’re ready.
wait: N (1–60). Server holds the connection up to N seconds. The cap is below typical proxy idle timeouts, so you won’t hit gateway 504s.

The flow is the same in every case:

POST /v1/generations               →  {status: "queued" | "running" | "completed" | "failed"}
↓ (still queued or running?)
GET  /v1/generations/{id}?wait=60  →  …repeat until terminal

You never need to implement a polling cadence — every wait is server-side.

Error codes

Errors share the unified envelope. Generation-specific codes you’ll see most:

`error.code`	HTTP	When it happens
`invalid_input`	400	Validation, unknown fields, out-of-range parameters.
`unauthenticated`	401	Missing or invalid API key.
`insufficient_quota`	402	Account out of credit or over plan limits.
`model_unavailable`	403	Model isn’t enabled for your account.
`not_found`	404	Unknown `{id}` on `GET`.
`content_policy`	422	Provider rejected the prompt or input image.
`rate_limited`	429	Slow down. `Retry-After` tells you for how long.
`provider_unavailable`	502	Every provider for this model returned an error.
`provider_timeout`	504	Every provider for this model timed out.
`internal_error`	500	Bug on our side — open a support ticket with the `id`.
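
Branching on error.code might look like the sketch below. The codes come from the table above, but the grouping into actions is an assumption made for illustration, not documented behavior, and next_action is a hypothetical helper name.

```python
# Assumed grouping: transient conditions that are usually worth retrying.
RETRYABLE = {"rate_limited", "provider_unavailable", "provider_timeout"}
# Assumed grouping: the request itself has to change.
FIX_REQUEST = {"invalid_input", "content_policy", "not_found"}
# Assumed grouping: the account or credentials have to change.
FIX_ACCOUNT = {"unauthenticated", "insufficient_quota", "model_unavailable"}


def next_action(error: dict) -> str:
    """Map an error object ({code, message}) to a coarse action."""
    code = error["code"]  # never inspect error["message"] for control flow
    if code in RETRYABLE:
        return "retry"
    if code in FIX_REQUEST:
        return "fix_request"
    if code in FIX_ACCOUNT:
        return "fix_account"
    # internal_error and any code this client does not know yet.
    return "report"
```

Unknown codes fall through to "report" so that a newly introduced code never triggers an automatic retry by accident.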

Patterns and tips

Always use the resource id, never poll a wall clock

Persist job.id immediately after POST. If your worker crashes or your user closes the tab, you can resume by re-fetching the same id — even hours later — and you’ll get the final state, including the signed URL.
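
One way to persist ids is a small JSON file of pending generations. This is a minimal sketch: the helper names and the file-based store are hypothetical, and a real service would more likely use its database or queue.

```python
import json
from pathlib import Path


def load_ids(store) -> list:
    """Return the ids still waiting for a terminal state."""
    path = Path(store)
    return json.loads(path.read_text()) if path.exists() else []


def remember(job_id: str, store) -> None:
    """Record an id right after POST, before doing anything else."""
    ids = load_ids(store)
    if job_id not in ids:
        ids.append(job_id)
        Path(store).write_text(json.dumps(ids))


def forget(job_id: str, store) -> None:
    """Drop an id once its terminal state has been handled."""
    ids = [i for i in load_ids(store) if i != job_id]
    Path(store).write_text(json.dumps(ids))
```

After a crash, a worker can iterate over load_ids(...) and re-fetch each id with GET /v1/generations/{id}.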

Set realistic outer timeouts

wait caps a single request at 60s. Apply your own outer deadline (e.g. 5 minutes for image, 10 minutes for video) and bail out cleanly with the last id so the user can be notified later.
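
An outer deadline around the server-side waits could be sketched as follows. The function name wait_until_terminal, the fetch callback, and the (job, timed_out) return shape are assumptions for illustration; fetch(job_id, wait) is expected to perform GET /v1/generations/{id}?wait=<wait> and return the parsed JSON.

```python
import time

TERMINAL = {"completed", "failed", "cancelled"}


def wait_until_terminal(job, fetch, deadline_s=300.0, clock=time.monotonic):
    """Re-fetch `job` until it is terminal or the outer deadline passes.

    Returns (job, timed_out). On timeout the last known job is returned,
    so the caller still has the id and can resume later.
    """
    give_up_at = clock() + deadline_s
    while job["status"] not in TERMINAL:
        remaining = give_up_at - clock()
        if remaining <= 0:
            return job, True
        # Ask the server to hold the connection no longer than we are willing
        # to wait, within the documented 60s cap (minimum 1s per request).
        wait = max(1, min(60, int(remaining)))
        job = fetch(job["id"], wait)
    return job, False
```

Because the last request is rounded up to at least one second, the call can overshoot the deadline slightly; that trade-off keeps every wait on the server side.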

Image-to-image / image-to-video

Pass reference URLs in input.images (max 8). The first reference is the primary input for single-reference models.
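
A request body with references might be assembled like this. The helper name reference_payload is hypothetical, and whether a given model accepts reference images is something to confirm on the model's listing.

```python
def reference_payload(model: str, prompt: str, primary_url: str, extra_urls=()) -> dict:
    """Build a POST body with the primary reference first in input.images."""
    images = [primary_url, *extra_urls]
    if len(images) > 8:
        raise ValueError("input.images accepts at most 8 reference URLs")
    return {"model": model, "input": {"prompt": prompt, "images": images}}
```

Keeping the primary reference as a separate argument makes it harder to reorder the list by accident, which matters for single-reference models.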

Reproducibility

Set input.seed for deterministic outputs on supported models. Same model + same provider + same seed + same prompt → same image.
