MCP Server - LLM Stats

The LLM Stats MCP server gives AI agents access to the model catalog, benchmark scores, rankings, and pricing over the Model Context Protocol. One gateway tool (compare-run) also runs prompts through the inference gateway and returns latency, tokens, and cost per model. The server is hosted, and apart from compare-run it is read-only over the Stats API. Cursor, Claude Code, Claude Desktop, Continue, and ChatGPT Apps connect directly.

Setup

Point your MCP client at the hosted server. No installation required.

Add this to your Cursor MCP settings (.cursor/mcp.json):

{
  "mcpServers": {
    "llm-stats": {
      "url": "https://mcp.llm-stats.com/mcp",
      "headers": {
        "Authorization": "Bearer ze_<your-api-key>"
      }
    }
  }
}

For Claude Code, run:

claude mcp add llm-stats --transport http https://mcp.llm-stats.com/mcp \
  --header "Authorization: Bearer ze_<your-api-key>"

Add this to claude_desktop_config.json:

{
  "mcpServers": {
    "llm-stats": {
      "command": "npx",
      "args": ["-y", "mcp-use", "client", "connect", "https://mcp.llm-stats.com/mcp"],
      "env": { "MCP_AUTHORIZATION": "Bearer ze_<your-api-key>" }
    }
  }
}

Add this to your Continue config.json:

{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "name": "llm-stats",
        "url": "https://mcp.llm-stats.com/mcp",
        "headers": { "Authorization": "Bearer ze_<your-api-key>" }
      }
    ]
  }
}

For ChatGPT Apps, configure your app with the hosted MCP URL and store the Bearer token in your platform’s secrets manager. ChatGPT renders every widget the server exposes.

URL: https://mcp.llm-stats.com/mcp
Header: Authorization: Bearer ze_<your-api-key>

Any other client that supports HTTP transport works. Set the server URL to https://mcp.llm-stats.com/mcp and pass your API key in the Authorization: Bearer header.
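For a client without built-in MCP support, the request shape can be sketched in Python. This is a minimal illustration, not the server's documented client: it assumes the endpoint accepts standard MCP JSON-RPC 2.0 messages over HTTP POST, and the helper name `build_mcp_request` is hypothetical. It only builds the request; a real session would first perform the MCP `initialize` handshake before calling `tools/list`.

```python
import json
import urllib.request

MCP_URL = "https://mcp.llm-stats.com/mcp"  # hosted server URL from the setup steps


def build_mcp_request(api_key: str, method: str, params: dict,
                      request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST for the hosted MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # An MCP HTTP server may answer with plain JSON or with an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    return urllib.request.Request(
        MCP_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


# Nothing is sent here; pass the request to urllib.request.urlopen to send it.
req = build_mcp_request("ze_<your-api-key>", "tools/list", {})
print(req.get_method(), req.full_url)
print(json.loads(req.data)["method"])
```

In practice an MCP client library handles session setup and streaming for you; the sketch only shows where the URL and Bearer header go.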

Authentication

The MCP server uses the same ze_... Bearer token as the Stats API and the Gateway API (the inference gateway), so you don’t need to provision a second key. Provision once, use everywhere. Get yours from the developer console.

Tools

Read tools

Read tools are safe to call at any time. They don’t modify state and don’t consume gateway credits.

Tool	Description
`list-models`	List models with filters (organization, modality, price, context, sort, cursor).
`get-model`	Full detail for one model id: every benchmark score, provider pricing, and source.
`list-benchmarks`	Benchmark catalog. Use it to disambiguate benchmark ids.
`list-scores`	Score matrix with filters across models and benchmarks.
`get-rankings`	TrueSkill rankings for a category.
`list-updates`	Models added in the last N days (1–30).
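As an illustration of calling a read tool, the sketch below builds an MCP `tools/call` payload for `list-updates`. The `days` argument name is taken from the prompt table on this page; the helper name `list_updates_call` is hypothetical, and the client-side range check simply mirrors the documented 1-30 window rather than any confirmed server behavior.

```python
import json


def list_updates_call(days: int, request_id: int = 1) -> dict:
    """Build a tools/call payload for list-updates, enforcing the 1-30 day window."""
    if not 1 <= days <= 30:
        raise ValueError("days must be between 1 and 30")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list-updates", "arguments": {"days": days}},
    }


payload = list_updates_call(7)
print(json.dumps(payload, indent=2))
```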

Workflow tools

Workflow tools chain multiple API calls. They’re still read-only.

Tool	Description
`compare-models`	Side-by-side comparison of 2–4 model ids.
`find-best-models-for-category`	Top-ranked models for a category, optionally filtered by price or open-weight.
`resolve-model`	Fuzzy lookup: free text in, ranked candidate `model_id`s out. Use before any other tool when you don’t have an exact id.
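The resolve-then-compare flow can be sketched as follows. This is a rough outline under stated assumptions: the argument names `query` and `model_ids` and the `candidates` result key are hypothetical (the page does not document the tool schemas), and `fake_call_tool` is an offline stand-in for a real MCP client session.

```python
from typing import Callable

# A tool caller takes (tool_name, arguments) and returns the tool's structured result.
ToolCaller = Callable[[str, dict], dict]


def compare_by_name(call_tool: ToolCaller, names: list[str]) -> dict:
    """Resolve free-text names to model ids, then request a side-by-side comparison."""
    if not 2 <= len(names) <= 4:
        raise ValueError("compare-models takes 2-4 models")
    model_ids = []
    for name in names:
        # ASSUMPTION: "query" and "candidates" are hypothetical schema names.
        candidates = call_tool("resolve-model", {"query": name}).get("candidates", [])
        if not candidates:
            raise LookupError(f"no model matched {name!r}")
        model_ids.append(candidates[0]["model_id"])  # take the top-ranked candidate
    # ASSUMPTION: "model_ids" is a hypothetical argument name.
    return call_tool("compare-models", {"model_ids": model_ids})


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Offline stand-in for an MCP client session, used only to exercise the flow."""
    if name == "resolve-model":
        slug = arguments["query"].lower().replace(" ", "-")
        return {"candidates": [{"model_id": slug}]}
    return {"tool": name, "arguments": arguments}


result = compare_by_name(fake_call_tool, ["Example Model A", "Example Model B"])
print(result["arguments"]["model_ids"])
```

Resolving first keeps the comparison call from failing on a misspelled or ambiguous id; check the real tool schemas (for example via `docs://capabilities`) before relying on these argument names.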

Gateway tools

Gateway tools call the inference gateway. The same ze_... key authenticates them.

compare-run hits the gateway and consumes credits. It’s annotated readOnlyHint: false and openWorldHint: true so MCP clients can prompt for approval before calling.

Tool	Description
`compare-run`	Run a single prompt through 2–4 models in parallel; returns per-model latency, tokens, cost, and output. Capped at 2000 max tokens.
`get-gateway-snippet`	Return a ready-to-paste curl, Python, AsyncOpenAI, or JavaScript snippet for invoking a model through the gateway.
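A client-side approval gate based on these annotations might look like the sketch below. The policy (ask for approval unless a tool explicitly declares `readOnlyHint: true`) is one possible choice, not the behavior of any particular MCP client. The `list_models` descriptor is an assumption for illustration, since this page does not list the annotations on read tools, and `clamp_max_tokens` is a hypothetical helper mirroring the documented 2000-token cap.

```python
MAX_COMPARE_RUN_TOKENS = 2000  # documented cap for compare-run


def needs_approval(tool: dict) -> bool:
    """Return True unless the tool explicitly declares itself read-only."""
    annotations = tool.get("annotations") or {}
    # MCP treats a missing readOnlyHint as false, so unannotated tools also need approval.
    return not annotations.get("readOnlyHint", False)


def clamp_max_tokens(requested: int) -> int:
    """Keep a requested token budget within the documented compare-run cap."""
    return max(1, min(requested, MAX_COMPARE_RUN_TOKENS))


# Annotations for compare-run as described on this page.
compare_run = {
    "name": "compare-run",
    "annotations": {"readOnlyHint": False, "openWorldHint": True},
}
# ASSUMPTION: a read tool annotated as read-only; not confirmed by this page.
list_models = {"name": "list-models", "annotations": {"readOnlyHint": True}}

print(needs_approval(compare_run), needs_approval(list_models), clamp_max_tokens(5000))
```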

Resources

Seven MCP resources are available for introspection and reference lookups.

URI	Description
`config://server-context`	Redacted server config: auth mode, base URL, and connection status.
`docs://capabilities`	Canonical tool, resource, prompt, and widget inventory with annotations.
`data://benchmarks`	Full benchmark catalog (`{ id, name, category, modality, max_score, verified }`) so the LLM picks correct ids.
`data://categories`	Category ids (`coding`, `math`, `reasoning`, …) usable with `get-rankings`.
`data://organizations`	Organization catalog (`{ id, name }`) derived from the model index.
`data://providers`	Provider catalog (`{ id, name }`) derived from every model’s `providers[]`.
`recipe://gateway-quickstart`	Paste-ready gateway quickstart: base URL, auth, OpenAI compatibility, streaming, tools, multimodal, snippets.
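Reading one of these resources uses the standard MCP `resources/read` method. The sketch below builds that payload and rejects URIs that are not in the table above; the helper name `read_resource_call` is hypothetical, and the local allow-list is a convenience for catching typos, not something the server requires.

```python
# The seven resource URIs listed in the table above.
KNOWN_RESOURCES = {
    "config://server-context",
    "docs://capabilities",
    "data://benchmarks",
    "data://categories",
    "data://organizations",
    "data://providers",
    "recipe://gateway-quickstart",
}


def read_resource_call(uri: str, request_id: int = 1) -> dict:
    """Build a resources/read payload for one of the documented resource URIs."""
    if uri not in KNOWN_RESOURCES:
        raise ValueError(f"unknown resource URI: {uri}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


payload = read_resource_call("data://categories")
print(payload["params"]["uri"])
```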

Slash-command prompts

Four MCP prompts show up as slash-commands in clients like Cursor and Claude Desktop.

Prompt	Arguments	What it does
`/compare-models`	`model_a`, `model_b`	Resolves each name with `resolve-model`, then calls `compare-models` with the resolved ids and writes a structured comparison.
`/find-cheapest-for`	`task`, optional `max_input_price`	Maps the task to a category via `data://categories`, calls `find-best-models-for-category`, explains the price/quality tradeoff.
`/whats-new-this-week`	(none)	Calls `list-updates` with `days=7` (falling back to 14), summarizes notable releases grouped by org.
`/explain-benchmark`	`benchmark_id_or_name`	Disambiguates via `data://benchmarks`, fetches top scores via `list-scores`, writes a SOTA and saturation summary.
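Clients that do not surface slash-commands can fetch a prompt directly with the standard MCP `prompts/get` method. In this sketch the helper name `get_prompt_call` is hypothetical, and it assumes the registered prompt name is the slash-command without its leading slash, which this page does not confirm. MCP prompt arguments are string-valued, so numeric values are converted to strings.

```python
def get_prompt_call(slash_command: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a prompts/get payload from a slash-command name and its arguments."""
    # ASSUMPTION: the prompt name is the slash-command minus the leading "/".
    name = slash_command.lstrip("/")
    # Drop unset optional arguments and stringify the rest.
    string_args = {key: str(value) for key, value in arguments.items() if value is not None}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": {"name": name, "arguments": string_args},
    }


payload = get_prompt_call("/find-cheapest-for", {"task": "coding", "max_input_price": 1.5})
print(payload["params"])
```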

Interactive widgets

The server exposes seven interactive widgets, each rendered by its backing tool(s) alongside a markdown summary. MCP Inspector, ChatGPT Apps, Cursor, Claude Desktop, VS Code, and Goose render both. Text-only clients fall back to the markdown.

Widget	Backing tool(s)
`leaderboard`	`list-models`, `find-best-models-for-category`
`model-card`	`get-model`
`model-compare-table`	`compare-models`
`benchmark-leaderboard`	`list-scores` (sorted, single-benchmark)
`rankings`	`get-rankings`
`updates-timeline`	`list-updates`
`compare-run`	`compare-run`
