The LLM Stats MCP server gives AI agents access to the model catalog, benchmark scores, rankings, and pricing over the Model Context Protocol. One gateway tool (compare-run) also runs prompts through the inference gateway and returns latency, tokens, and cost per model. The server is hosted, and its access to the Stats API is read-only. Cursor, Claude Code, Claude Desktop, Continue, and ChatGPT Apps connect directly.

Setup

Point your MCP client at the hosted server. No installation required.
Add this to your Cursor MCP settings (.cursor/mcp.json):
{
  "mcpServers": {
    "llm-stats": {
      "url": "https://mcp.llm-stats.com/mcp",
      "headers": {
        "Authorization": "Bearer ze_<your-api-key>"
      }
    }
  }
}
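
If you prefer to generate the settings file rather than paste it, the following is a minimal sketch that builds the same structure from an environment variable. The variable name LLM_STATS_API_KEY and the helper build_cursor_config are illustrative assumptions, not names defined by LLM Stats.

```python
import json
import os


def build_cursor_config(api_key: str) -> dict:
    """Return the same structure as the .cursor/mcp.json example above."""
    return {
        "mcpServers": {
            "llm-stats": {
                "url": "https://mcp.llm-stats.com/mcp",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # LLM_STATS_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("LLM_STATS_API_KEY", "ze_<your-api-key>")
    print(json.dumps(build_cursor_config(key), indent=2))
```

Redirect the output to .cursor/mcp.json, and keep the real key out of version control.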

Authentication

The same ze_... Bearer token authenticates the Stats API, the inference gateway, and the MCP server, so you don’t need to provision a second key. Get yours from the developer console.
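
Because one key covers all three surfaces, a client can build the Authorization header once and reuse it. The sketch below is illustrative only: the helper name bearer_headers is hypothetical, and the prefix check simply mirrors the ze_... format described here rather than any documented validation rule.

```python
def bearer_headers(api_key: str) -> dict:
    """Build the Authorization header used by the MCP server, Stats API, and gateway."""
    # Assumption: every valid key starts with "ze_", as the examples on this page suggest.
    if not api_key.startswith("ze_"):
        raise ValueError("expected a key starting with 'ze_'")
    return {"Authorization": f"Bearer {api_key}"}


# The same dict can be passed to any HTTP client for each of the three services.
headers = bearer_headers("ze_example")
```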

Tools

Read tools

Read tools are safe to call at any time. They don’t modify state and don’t consume gateway credits.
| Tool | Description |
| --- | --- |
| list-models | List models with filters (organization, modality, price, context, sort, cursor). |
| get-model | Full detail for one model id: every benchmark score, provider pricing, and source. |
| list-benchmarks | Benchmark catalog. Use it to disambiguate benchmark ids. |
| list-scores | Score matrix with filters across models and benchmarks. |
| get-rankings | TrueSkill rankings for a category. |
| list-updates | Models added in the last N days (1–30). |
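
MCP clients normally issue tool calls for you, but it can help to see the request shape. The sketch below builds a JSON-RPC tools/call message for list-updates and enforces the 1–30 day range on the client side. The tools/call method comes from the MCP specification, and the days argument is the one the /whats-new-this-week prompt uses; the helper names are hypothetical.

```python
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def list_updates_request(days: int) -> dict:
    """Request models added in the last `days` days (allowed range: 1 to 30)."""
    if not 1 <= days <= 30:
        raise ValueError("days must be between 1 and 30")
    return tool_call("list-updates", {"days": days})
```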

Workflow tools

Workflow tools chain multiple API calls. They’re still read-only.
| Tool | Description |
| --- | --- |
| compare-models | Side-by-side comparison of 2–4 model ids. |
| find-best-models-for-category | Top-ranked models for a category, optionally filtered by price or open-weight. |
| resolve-model | Fuzzy lookup: free text in, ranked candidate model_ids out. Use before any other tool when the id is fuzzy. |
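
The resolve-then-compare pattern can be sketched as follows. This is an assumption-heavy illustration: call_tool stands in for whatever function your MCP client exposes, and the argument names (query, model_ids) and the candidates response field are hypothetical, since the tool schemas are not listed on this page.

```python
from typing import Callable, List

# Hypothetical signature: (tool name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, dict], dict]


def compare_by_name(call_tool: ToolCaller, names: List[str]) -> dict:
    """Resolve free-text names to ids, then compare the resolved models."""
    if not 2 <= len(names) <= 4:
        raise ValueError("compare-models accepts 2 to 4 models")
    model_ids = []
    for name in names:
        # Assumed argument name and response shape for resolve-model.
        result = call_tool("resolve-model", {"query": name})
        candidates = result.get("candidates", [])
        if not candidates:
            raise LookupError(f"no model matched {name!r}")
        model_ids.append(candidates[0])  # take the top-ranked candidate
    # Assumed argument name for compare-models.
    return call_tool("compare-models", {"model_ids": model_ids})
```

Taking the top-ranked candidate is a simplification; an agent may prefer to ask the user when several candidates score closely.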

Gateway tools

Gateway tools call the inference gateway. The same ze_... key authenticates them.
compare-run hits the gateway and consumes credits. It’s annotated readOnlyHint: false and openWorldHint: true so MCP clients can prompt for approval before calling.
| Tool | Description |
| --- | --- |
| compare-run | Run a single prompt through 2–4 models in parallel; returns per-model latency, tokens, cost, and output. Capped at 2000 max tokens. |
| get-gateway-snippet | Return a ready-to-paste curl, Python, AsyncOpenAI, or JavaScript snippet for invoking a model through the gateway. |
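
A client-side approval policy based on these annotations might look like the sketch below. The readOnlyHint and openWorldHint fields and their defaults (false and true) follow the MCP specification's tool annotations; the policy itself and the list-models annotation shown in the example are assumptions, not documented server behavior.

```python
def needs_approval(tool: dict) -> bool:
    """Return True when a tool is not marked read-only and should be confirmed first."""
    annotations = tool.get("annotations") or {}
    # Per the MCP spec, readOnlyHint defaults to False when absent.
    return not annotations.get("readOnlyHint", False)


compare_run = {
    "name": "compare-run",
    "annotations": {"readOnlyHint": False, "openWorldHint": True},
}
# Assumed annotation for a read tool; check docs://capabilities for the real values.
list_models = {"name": "list-models", "annotations": {"readOnlyHint": True}}

for tool in (compare_run, list_models):
    action = "ask the user first" if needs_approval(tool) else "call directly"
    print(f"{tool['name']}: {action}")
```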

Resources

Seven MCP resources are available for introspection and reference lookups.
| URI | Description |
| --- | --- |
| config://server-context | Redacted server config: auth mode, base URL, and connection status. |
| docs://capabilities | Canonical tool, resource, prompt, and widget inventory with annotations. |
| data://benchmarks | Full benchmark catalog ({ id, name, category, modality, max_score, verified }) so the LLM picks correct ids. |
| data://categories | Category ids (coding, math, reasoning, …) usable with get-rankings. |
| data://organizations | Organization catalog ({ id, name }) derived from the model index. |
| data://providers | Provider catalog ({ id, name }) derived from every model’s providers[]. |
| recipe://gateway-quickstart | Paste-ready gateway quickstart: base URL, auth, OpenAI compatibility, streaming, tools, multimodal, snippets. |
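
Resources are fetched with the MCP resources/read method. The sketch below builds that request and rejects URIs that are not in the list above; the helper name and the client-side allow-list are illustrative choices, not part of the server.

```python
KNOWN_RESOURCES = {
    "config://server-context",
    "docs://capabilities",
    "data://benchmarks",
    "data://categories",
    "data://organizations",
    "data://providers",
    "recipe://gateway-quickstart",
}


def read_resource_request(uri: str, request_id: int = 1) -> dict:
    """Build an MCP resources/read request for one of the seven listed resources."""
    if uri not in KNOWN_RESOURCES:
        raise ValueError(f"unknown resource URI: {uri}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }
```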

Slash-command prompts

Four MCP prompts show up as slash-commands in clients like Cursor and Claude Desktop.
| Prompt | Arguments | What it does |
| --- | --- | --- |
| /compare-models | model_a, model_b | Resolves each name with resolve-model, then calls compare-models with the resolved ids and writes a structured comparison. |
| /find-cheapest-for | task, optional max_input_price | Maps the task to a category via data://categories, calls find-best-models-for-category, explains the price/quality tradeoff. |
| /whats-new-this-week | (none) | Calls list-updates with days=7 (falling back to 14), summarizes notable releases grouped by org. |
| /explain-benchmark | benchmark_id_or_name | Disambiguates via data://benchmarks, fetches top scores via list-scores, writes a SOTA and saturation summary. |
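
The fallback in /whats-new-this-week can be approximated in client code as below. This is a sketch under assumptions: the page does not say what triggers the fallback, so an empty result is assumed; call_tool is a stand-in for your MCP client's tool-calling function, and the models response field is hypothetical.

```python
from typing import Callable, List, Tuple


def recent_updates(
    call_tool: Callable[[str, dict], dict],
    windows: Tuple[int, ...] = (7, 14),
) -> Tuple[int, List]:
    """Try each look-back window in order and return the first non-empty result."""
    for days in windows:
        result = call_tool("list-updates", {"days": days})
        models = result.get("models", [])  # assumed response field
        if models:
            return days, models
    # Nothing found in any window: report the widest window tried.
    return windows[-1], []
```

The function returns the window that produced results, so a summary can state whether it covers 7 or 14 days.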

Interactive widgets

Seven interactive widgets, backed by the eight tools listed below, render alongside a markdown summary. MCP Inspector, ChatGPT Apps, Cursor, Claude Desktop, VS Code, and Goose render both the widget and the summary. Text-only clients fall back to the markdown.
| Widget | Backing tool(s) |
| --- | --- |
| leaderboard | list-models, find-best-models-for-category |
| model-card | get-model |
| model-compare-table | compare-models |
| benchmark-leaderboard | list-scores (sorted, single-benchmark) |
| rankings | get-rankings |
| updates-timeline | list-updates |
| compare-run | compare-run |
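
The widget table can be expressed as a lookup, with the markdown fallback made explicit. The mapping below is taken from the table; the render_plan helper and its client_supports_widgets flag are hypothetical names used only for illustration.

```python
WIDGET_FOR_TOOL = {
    "list-models": "leaderboard",
    "find-best-models-for-category": "leaderboard",
    "get-model": "model-card",
    "compare-models": "model-compare-table",
    # Per the table, this widget applies to sorted, single-benchmark results.
    "list-scores": "benchmark-leaderboard",
    "get-rankings": "rankings",
    "list-updates": "updates-timeline",
    "compare-run": "compare-run",
}


def render_plan(tool: str, client_supports_widgets: bool) -> list:
    """Return what a client shows for a tool result: always markdown, plus a widget if possible."""
    widget = WIDGET_FOR_TOOL.get(tool)
    if widget and client_supports_widgets:
        return ["markdown", widget]
    return ["markdown"]
```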