Speech-to-text - LLM Stats

POST https://gateway.llm-stats.com/v1/stt/transcribe

Upload an audio file as multipart form data and get a JSON transcript back.

Quickstart

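import requests

# Upload the file as multipart form data; `audio` and `model_id` are required.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://gateway.llm-stats.com/v1/stt/transcribe",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"audio": f},
        data={"model_id": "whisper-1"},
    )

# Raise on 4xx/5xx instead of hitting a KeyError on an error body.
response.raise_for_status()
print(response.json()["text"])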

Request

The body is multipart/form-data with the following fields:

audio

file

required

Audio file. Up to 25 MB. Supported formats: wav, mp3, m4a, mp4, webm, ogg, opus, flac.

model_id

string

required

STT model ID (e.g. whisper-1, deepgram-nova-3).

language

string

ISO-639 language hint (e.g. "en", "es"). Skips auto-detection where supported.

provider_id

string

Force a specific provider for this request. This bypasses the router, so use it sparingly and only when you need parity with a baseline.
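The field rules above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: `check_audio` and `build_fields` are hypothetical helper names, the format check looks only at the file extension, and "25 MB" is assumed to mean 25 × 1024 × 1024 bytes (the page does not say which definition of megabyte applies).

```python
from pathlib import Path

# Assumption: "Up to 25 MB" is read as binary megabytes.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED = {"wav", "mp3", "m4a", "mp4", "webm", "ogg", "opus", "flac"}


def check_audio(path, size=None):
    """Return a list of problems found before upload (empty list means OK).

    `size` can be passed when the byte count is already known; otherwise
    it is read from disk.
    """
    p = Path(path)
    problems = []
    ext = p.suffix.lower().lstrip(".")
    if ext not in SUPPORTED:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size is None:
        size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file too large: {size} bytes")
    return problems


def build_fields(model_id, language=None, provider_id=None):
    """Build the non-file form fields, leaving out optional ones that are unset."""
    fields = {"model_id": model_id}
    if language is not None:
        fields["language"] = language
    if provider_id is not None:
        fields["provider_id"] = provider_id
    return fields
```

The result of `build_fields("whisper-1", language="en")` can be passed as the `data=` argument of the `requests.post` call in the Quickstart. The server remains the authority on what it accepts, so a clean local check does not rule out a `400` or `413`.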

Response

{
  "text": "Hello, this is a test transcription.",
  "duration": 2.45,
  "language": "en",
  "confidence": 0.97,
  "words": [
    { "word": "Hello", "start": 0.00, "end": 0.42, "confidence": 0.98 },
    { "word": "this",  "start": 0.55, "end": 0.74, "confidence": 0.97 }
  ],
  "model": "whisper-1"
}

text

string

Full transcript.

duration

number

Audio duration in seconds.

language

string | null

Detected (or supplied) language code.

confidence

number | null

Overall confidence between 0 and 1, when the provider exposes it.

words

array | null

Per-word timestamps and confidences, when the provider supports them.
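Because `language`, `confidence`, and `words` can be `null`, client code should not assume they are present. A small sketch of defensive parsing follows; `describe` and `word_lines` are hypothetical helper names, and the per-word keys (`word`, `start`, `end`) are taken from the sample response above.

```python
def describe(result):
    """One-line summary that tolerates null language and confidence."""
    conf = result.get("confidence")
    conf_text = "n/a" if conf is None else f"{conf:.0%}"
    lang = result.get("language") or "unknown"
    return f"{result['duration']:.2f}s, language={lang}, confidence={conf_text}"


def word_lines(result):
    """Format per-word timestamps; returns [] when the provider sent none."""
    words = result.get("words") or []
    return [f"[{w['start']:.2f}-{w['end']:.2f}] {w['word']}" for w in words]


sample = {
    "text": "Hello, this is a test transcription.",
    "duration": 2.45,
    "language": "en",
    "confidence": 0.97,
    "words": [
        {"word": "Hello", "start": 0.00, "end": 0.42, "confidence": 0.98},
        {"word": "this", "start": 0.55, "end": 0.74, "confidence": 0.97},
    ],
    "model": "whisper-1",
}

print(describe(sample))
for line in word_lines(sample):
    print(line)
```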

Streaming

For real-time transcription, open a WebSocket to wss://gateway.llm-stats.com/v1/stt/stream and stream PCM audio frames. The batch HTTP endpoint above is the right choice for files you already have on disk.
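This page does not document the streaming handshake, authentication, or message format, so the sketch below covers only the client-side step of cutting raw PCM into frames. The defaults are assumptions chosen for illustration, not values required by the service: 16 kHz, 16-bit, mono audio in 20 ms frames. `pcm_frames` is a hypothetical helper name.

```python
def pcm_frames(pcm, sample_rate=16000, sample_width=2, channels=1, frame_ms=20):
    """Yield consecutive fixed-duration frames from a raw PCM byte string.

    The final frame may be shorter if the audio does not divide evenly.
    """
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    for offset in range(0, len(pcm), frame_bytes):
        yield pcm[offset:offset + frame_bytes]


# One second of 16 kHz, 16-bit mono silence is 32000 bytes.
one_second = bytes(16000 * 2)
frames = list(pcm_frames(one_second))
print(len(frames), len(frames[0]))
```

With a WebSocket client, each frame would typically be sent as one binary message, paced at roughly the frame duration for live audio; check the streaming reference for the exact protocol before relying on this.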

Errors

Failures use the shared error envelope. Common ones:

Status	`error.code`	When
`400`	`invalid_input`	Missing fields, unsupported format.
`401`	`unauthenticated`	Missing or invalid API key.
`402`	`insufficient_quota`	Out of credit.
`413`	`invalid_input`	File larger than 25 MB.
`429`	`rate_limited`	Quota exceeded — back off using `Retry-After`.
`502`	`provider_unavailable`	Every STT provider for this model errored.
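One way to act on this table is to separate retryable failures from permanent ones. The sketch below makes several assumptions that the page does not confirm: the error envelope is shaped like `{"error": {"code": ...}}` (inferred from the `error.code` column), `502` is worth retrying, and exponential backoff is used when `Retry-After` is missing or given as an HTTP date. `error_code` and `retry_delay` are hypothetical helper names.

```python
# Assumption: 429 and 502 are transient; all other statuses are not retried.
RETRYABLE = {429, 502}


def error_code(body):
    """Return error.code from a parsed error body, or None if the shape differs."""
    if isinstance(body, dict):
        err = body.get("error")
        if isinstance(err, dict):
            return err.get("code")
    return None


def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Seconds to wait before the next attempt, or None if retrying is pointless.

    `attempt` is zero-based. For 429 a numeric Retry-After header wins;
    otherwise the delay is capped exponential backoff.
    """
    if status not in RETRYABLE:
        return None
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # HTTP-date form; fall back to backoff below
    return min(cap, base * 2 ** attempt)
```

With `requests`, `response.status_code` and `response.headers` can be passed straight in (its header mapping is case-insensitive), and `error_code(response.json())` gives the value to log or branch on. The `400`, `401`, `402`, and `413` cases need a changed request or account state, so they return `None` here.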
