Start with the SDK quickstart

Build your first dataset, run a task, and compute quality metrics.
About the stack: LLM Stats runs on top of ZeroEval, an evaluation library developed by the same team.

What You Can Do

Upload benchmark data

Add data via git, CSV import, or the browser editor.

Load and inspect data

Pull datasets, access rows with dot notation, and work with slices/subsets.
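
A minimal sketch of that flow, assuming illustrative names (`ze.init`, `Dataset.pull`, and the `question` field are placeholders, not necessarily the documented API; see /python-sdk/quickstart for the real calls):

```python
import zeroeval as ze

ze.init()  # assumption: the client reads ZEROEVAL_API_KEY from the environment

# Hypothetical loader; the actual method name may differ.
ds = ze.Dataset.pull("my-benchmark")

row = ds[0]
print(row.question)  # dot-notation access to a column ("question" is illustrative)

subset = ds[:100]    # slices behave like smaller datasets
```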

Run evals

Execute tasks with configurable workers, retries, and checkpoints.
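
As a sketch of what a configured run could look like (the `@ze.task` decorator and the `ze.run` parameters are assumptions for illustration; the evals docs define the actual interface):

```python
import zeroeval as ze

ds = ze.Dataset.pull("my-benchmark")  # hypothetical, as in the loading sketch

@ze.task  # hypothetical decorator marking the per-row function
def answer(row):
    return call_model(row.question)  # call_model is a placeholder for your model call

results = ze.run(
    answer,
    dataset=ds,
    workers=8,        # parallel workers (parameter name illustrative)
    retries=2,        # retry transient failures
    checkpoint=True,  # resume from the last completed row after an interruption
)
```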

Score quality

Add row-, column-, and run-level evaluations for robust measurement.
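
For instance, a row-level evaluator might look like this (the `@ze.evaluator` registration is an assumed name; column- and run-level hooks would aggregate the same kind of score over the whole run):

```python
import zeroeval as ze

@ze.evaluator  # hypothetical registration; the documented API may differ
def exact_match(row, output):
    # Row-level score: 1.0 when the model output matches the reference answer.
    return float(output.strip() == row.answer)
```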

Inspect signals and traces

Emit runtime signals during execution and inspect them through traces.
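
A sketch of emitting a signal from inside a task (`ze.signal` is an assumed name for illustration; the signals docs define the real call):

```python
import zeroeval as ze

@ze.task  # hypothetical, as above
def answer(row):
    output, usage = call_model(row.question)       # call_model is a placeholder
    ze.signal("total_tokens", usage.total_tokens)  # attach a runtime signal to this row's trace
    return output
```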

Call models via the gateway

OpenAI-compatible chat, plus unified image, video, TTS, and STT endpoints.
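
Because the chat endpoint is OpenAI-compatible, the standard openai client should work once pointed at the gateway; the base URL and model id below are placeholders, not documented values:

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder; substitute the documented gateway URL
    api_key=os.environ["ZEROEVAL_API_KEY"],
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any model id the gateway exposes
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```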

Next Steps

1. Install and authenticate

Follow /python-sdk/installation to install zeroeval and configure ZEROEVAL_API_KEY.

2. Complete the first eval

Run the walkthrough in /python-sdk/quickstart.

3. Upload benchmark data

Push data via git, import CSVs, or use the browser editor. Learn about subsets, versioning, and the recommended repository layout.
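
A sketch of the CSV path (the `from_csv` and `push` method names are illustrative assumptions; the datasets docs define the real interface):

```python
import zeroeval as ze

ze.init()  # assumes ZEROEVAL_API_KEY is set

# Hypothetical import-and-upload; pushing again would create a new version.
ds = ze.Dataset.from_csv("benchmark.csv", name="my-benchmark")
ds.push()
```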

4. Productionize eval execution

Add scoring, repetition, and resume/reliability controls.
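
Putting those controls together, a production-leaning run might look like this sketch (parameter names are assumptions, reusing the task and evaluator from the earlier sketches):

```python
results = ze.run(
    answer,                    # the task from the run-evals sketch
    dataset=ds,
    evaluators=[exact_match],  # scoring attached to the run
    repetitions=3,             # repeat each row to measure variance
    resume=True,               # continue an interrupted run instead of restarting
)
```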

Documentation Scope

  • Getting Started: setup and first run
  • Datasets: creation, loading, versioning, subsets, multimodal
  • Evals: execution, scoring, metrics, repetitions, resume
  • Examples: end-to-end text and multimodal workflows