Start with the SDK quickstart
Build your first dataset, run a task, and compute quality metrics.
About the stack: LLM Stats runs on top of ZeroEval, an open evaluation library built by the LLM Stats team.
What You Can Do
Create datasets
Build datasets from Python lists or CSV files and push them to ZeroEval.
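Conceptually, a dataset is a named collection of rows, and the same rows can come from a Python list or a CSV file. The sketch below is illustrative only (it is not the ZeroEval API; see the SDK docs for the real calls):

```python
import csv
import io

# Illustrative sketch (NOT the ZeroEval API): a dataset as a named
# collection of dict rows, built from a Python list...
rows = [
    {"question": "2+2?", "answer": "4"},
    {"question": "Capital of France?", "answer": "Paris"},
]

# ...or from CSV (a file path works the same way as this in-memory text).
csv_text = "question,answer\n2+2?,4\nCapital of France?,Paris\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

assert csv_rows == rows  # both sources yield identical rows
dataset = {"name": "qa-demo", "rows": rows}
print(len(dataset["rows"]))  # → 2
```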
Load and inspect data
Pull datasets, access rows with dot notation, and work with slices/subsets.
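Dot-notation access means a row's columns read like attributes, and a subset is just a slice. A minimal concept sketch (again, not the ZeroEval API) of how that pattern behaves:

```python
# Illustrative sketch (NOT the ZeroEval API): dot-notation column
# access on a row, plus ordinary slicing for subsets.
class Row:
    def __init__(self, data):
        self._data = dict(data)

    def __getattr__(self, name):
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

rows = [Row({"question": f"q{i}", "answer": f"a{i}"}) for i in range(10)]

first = rows[0]
print(first.question)  # → q0

subset = rows[:3]      # a slice is just a smaller list of rows
print(len(subset))     # → 3
```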
Run evals
Execute tasks with configurable workers, retries, and checkpoints.
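The three knobs named above (workers, retries, checkpoints) compose in a standard way: a worker pool fans the task out over rows, each row gets a bounded retry loop, and a checkpoint records finished rows so a resumed run skips them. A self-contained sketch of that pattern, not the ZeroEval runner itself:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch (NOT the ZeroEval API): run a task over rows with
# a worker pool, per-row retries, and a checkpoint of finished row ids.
def run_task(rows, task, workers=4, retries=2, checkpoint=None):
    done = checkpoint if checkpoint is not None else set()

    def run_one(row):
        if row["id"] in done:
            return row["id"], "skipped"  # already finished in a prior run
        for attempt in range(retries + 1):
            try:
                result = task(row)
                done.add(row["id"])  # checkpoint: mark row as finished
                return row["id"], result
            except Exception:
                if attempt == retries:
                    return row["id"], "failed"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, rows))

rows = [{"id": i, "x": i} for i in range(5)]
results = run_task(rows, task=lambda r: r["x"] * 2)
print(results[4])  # → 8
```

Passing the same `checkpoint` set into a second call would skip every row the first call completed, which is the essence of resume.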
Score quality
Add row, column, and run-level evaluations for robust measurement.
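The three evaluation levels differ only in scope: a row eval scores one row, a column eval aggregates a single column across rows, and a run eval summarizes the whole run. A concept sketch (not the ZeroEval API) showing one example of each:

```python
# Illustrative sketch (NOT the ZeroEval API) of the three evaluation levels.
rows = [
    {"expected": "4", "output": "4"},
    {"expected": "Paris", "output": "Paris"},
    {"expected": "blue", "output": "red"},
]

# Row-level: one score per row (exact match here).
row_scores = [1.0 if r["output"] == r["expected"] else 0.0 for r in rows]

# Column-level: aggregate a single column (mean output length here).
mean_output_len = sum(len(r["output"]) for r in rows) / len(rows)

# Run-level: a single metric for the entire run (accuracy here).
accuracy = sum(row_scores) / len(row_scores)
print(accuracy)  # → 0.6666666666666666
```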
Recommended Path
Install and authenticate
Follow /python-sdk/installation to install zeroeval and configure ZEROEVAL_API_KEY.
Complete the first eval
Run the walkthrough in /python-sdk/quickstart.
Documentation Scope
- Getting Started: setup and first run
- Datasets: creation, loading, versioning, subsets, multimodal
- Evals: execution, scoring, metrics, repetitions, resume
- Examples: end-to-end text and multimodal workflows
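The setup half of the recommended path reduces to two commands (the package name and environment variable follow the installation page; replace the key with your own):

```shell
# Install the SDK and configure the API key from /python-sdk/installation.
pip install zeroeval
export ZEROEVAL_API_KEY="your-api-key"
```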