Quickstart
Request
The body is multipart/form-data with the following fields:
Audio file. Up to 25 MB. Supported formats: wav, mp3, m4a, mp4, webm, ogg, opus, flac.
STT model ID (e.g. whisper-1, deepgram-nova-3).
ISO-639 language hint (e.g. "en", "es"). Skips auto-detection where supported.
Force a specific provider for this request. This bypasses the router, so use it sparingly, only when you need parity with a baseline.
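
The size and format limits above can be checked locally before uploading. The following is a minimal sketch, not the gateway's own client: the form field names (`model`, `language`, `provider`) and the helper `build_stt_fields` are hypothetical, and it assumes "25 MB" means 25 × 1024 × 1024 bytes.

```python
import os

# Limits taken from the field descriptions above.
# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED = {"wav", "mp3", "m4a", "mp4", "webm", "ogg", "opus", "flac"}


def build_stt_fields(path, model, language=None, provider=None):
    """Validate the audio file locally and return the text form fields.

    The field names used here ("model", "language", "provider") are
    hypothetical placeholders for the fields described above.
    """
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file larger than 25 MB")

    fields = {"model": model}
    # Optional fields are only sent when the caller sets them.
    if language is not None:
        fields["language"] = language
    if provider is not None:
        fields["provider"] = provider
    return fields
```

The returned dictionary could then be sent as the text parts of a multipart request next to the open file handle, for example through the `data=` and `files=` arguments of an HTTP client such as `requests`.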
Response
Full transcript.
Audio duration in seconds.
Detected (or supplied) language code.
Overall confidence between 0 and 1, when the provider exposes it.
Per-word timestamps and confidences, when the provider supports them.
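
Because confidence and per-word data only appear when the provider exposes them, client code should treat both as optional. A minimal parsing sketch follows; the JSON key names (`text`, `duration`, `language`, `confidence`, `words`) are assumptions for illustration and are not confirmed by this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Transcript:
    text: str                            # full transcript
    duration: float                      # audio duration in seconds
    language: str                        # detected or supplied language code
    confidence: Optional[float] = None   # 0..1, absent for some providers
    words: list = field(default_factory=list)  # per-word entries, may be empty


def parse_transcript(payload: dict) -> Transcript:
    """Build a Transcript from a decoded JSON response body.

    All key names are hypothetical; adjust them to the actual response.
    """
    return Transcript(
        text=payload["text"],
        duration=float(payload["duration"]),
        language=payload["language"],
        confidence=payload.get("confidence"),  # None when not provided
        words=payload.get("words") or [],      # [] when not provided
    )
```

Callers can then check `confidence is None` instead of assuming a default score.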
Streaming
For real-time transcription, open a WebSocket to wss://gateway.llm-stats.com/v1/stt/stream and stream PCM audio frames. The
batch HTTP endpoint above is the right choice for files you already have on
disk.
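
Streaming PCM usually means cutting the raw audio into small fixed-duration frames before sending each one as a binary WebSocket message. The sketch below shows only that framing step. The sample rate, sample width, channel count, and 20 ms frame length are assumptions, since this page does not state the format the stream endpoint expects, and the WebSocket client itself is left out.

```python
def pcm_frames(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 20,
               sample_width: int = 2, channels: int = 1):
    """Yield consecutive fixed-duration frames from raw PCM bytes.

    Defaults (16 kHz, 16-bit, mono, 20 ms) are illustrative assumptions.
    The final frame may be shorter than the others.
    """
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

With the defaults, each full frame is 640 bytes (320 samples of 2 bytes each).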
Errors
Failures use the shared error envelope. Common ones:

| Status | error.code | When |
|---|---|---|
| 400 | invalid_input | Missing fields, unsupported format. |
| 401 | unauthenticated | Missing or invalid API key. |
| 402 | insufficient_quota | Out of credit. |
| 413 | invalid_input | File larger than 25 MB. |
| 429 | rate_limited | Quota exceeded; back off using Retry-After. |
| 502 | provider_unavailable | Every STT provider for this model errored. |
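
One way to act on the table is to retry only the statuses that can succeed later and to honor Retry-After on 429. The sketch below is an illustration under stated assumptions: treating 502 as retryable is a choice made here rather than something this page specifies, the helper `retry_delay` is hypothetical, and only the delay-in-seconds form of Retry-After is parsed (an HTTP-date value falls back to exponential backoff).

```python
from typing import Optional

# Assumption: 429 and 502 are worth retrying; the other listed statuses
# (400, 401, 402, 413) need a changed request or account, not a retry.
RETRYABLE = {429, 502}


def retry_delay(status: int, headers: dict, attempt: int,
                base: float = 1.0, cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None to give up.

    `headers` is a plain dict keyed by the exact name "Retry-After";
    `attempt` is the zero-based count of retries already made.
    """
    if status not in RETRYABLE:
        return None
    if status == 429:
        value = headers.get("Retry-After", "")
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # header missing or in HTTP-date form; fall through
    # Capped exponential backoff: base, 2*base, 4*base, ...
    return min(cap, base * 2 ** attempt)
```

A caller would sleep for the returned number of seconds and resend, stopping after a fixed number of attempts.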