Introduction

Composo runs a public Model Context Protocol server so you can query your evaluation data — criteria, tags, bucketed aggregates, and individual traces — from any MCP-capable LLM client (Claude Desktop, Claude Code, the mcp CLI, custom apps using the official SDK). The server is read-only and tenant-scoped: tools see only your own domain’s data, derived automatically from the API key you connect with. There is no “domain” or “customer” argument on any tool.

When to Use It

  • Ad-hoc analysis with an LLM: ask Claude “what’s the average helpfulness score by agent over the last week?” and let it call the right tool, instead of clicking through dashboards or writing a query.
  • Trace debugging: surface low-scoring or filter-narrowed examples directly into an LLM session for inspection.
  • Embedding evaluation insight into your own app: any tool catalogue your agent already exposes via MCP can include these read-side surfaces.
For programmatic write access (running new evaluations), use the Python SDK instead. MCP is complementary — read-side only.

Connect

The production MCP endpoint is:
https://platform.composo.ai/mcp
Authentication uses a standard Composo API key. Generate one from the API Keys settings page — the same key that authenticates the REST API and the Python SDK works here. Send the key either as an API-Key header or as Authorization: Bearer <key> — both are accepted, so client schemas that only support Bearer-style auth still work.
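As a minimal sketch of the two accepted header forms, the helper below builds either one from the same key. The function name `composo_auth_headers` and the `COMPOSO_API_KEY` environment variable are illustrative assumptions, not part of Composo's API.

```python
import os


def composo_auth_headers(api_key: str, bearer: bool = False) -> dict[str, str]:
    """Return HTTP headers for one of the two accepted auth styles."""
    if not api_key:
        raise ValueError("an API key is required")
    if bearer:
        # For clients whose config only supports Bearer-style auth.
        return {"Authorization": f"Bearer {api_key}"}
    return {"API-Key": api_key}


# COMPOSO_API_KEY is a hypothetical variable name; use whatever your setup defines.
key = os.environ.get("COMPOSO_API_KEY", "your-composo-api-key")
print(composo_auth_headers(key))
print(composo_auth_headers(key, bearer=True))
```

Reading the key from the environment rather than hardcoding it keeps it out of source control; either header form can then be passed to whichever client you use.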

Claude Desktop

Add an entry under mcpServers in your Claude Desktop config (Settings → Developer → Edit Config):
{
  "mcpServers": {
    "composo": {
      "transport": "http",
      "url": "https://platform.composo.ai/mcp",
      "headers": {
        "API-Key": "your-composo-api-key"
      }
    }
  }
}
Restart Claude Desktop. The Composo tools appear in the tool picker; ask Claude a question about your evaluations and it will call them.

mcp CLI

mcp connect https://platform.composo.ai/mcp \
  --header "API-Key: your-composo-api-key"

Claude Code

claude mcp add --transport http composo https://platform.composo.ai/mcp \
  --header "API-Key: your-composo-api-key"
New claude sessions will load the server and surface its tools in-session.

Generic MCP client

Any MCP client that supports the Streamable HTTP transport can connect — point it at https://platform.composo.ai/mcp and send API-Key (or Authorization: Bearer) on each request.
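import asyncio

from mcp.client.session import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    # Open a Streamable HTTP connection, authenticating with the API-Key header.
    async with streamablehttp_client(
        "https://platform.composo.ai/mcp",
        headers={"API-Key": "your-composo-api-key"},
    ) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake must precede any tool call
            result = await session.call_tool("list_criteria", {})
            print(result.content)

asyncio.run(main())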
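A tool call made through the MCP Python SDK returns a result object with an `isError` flag and a list of content blocks. The sketch below joins the text blocks into one string. It assumes the Composo tools return their payload as text blocks, which this page does not state, and `tool_result_text` is a hypothetical helper name.

```python
from types import SimpleNamespace


def tool_result_text(result) -> str:
    """Join the text blocks of an MCP tool result, raising if the call failed."""
    if getattr(result, "isError", False):
        raise RuntimeError("tool call reported an error")
    parts = [
        block.text
        for block in result.content
        if getattr(block, "type", None) == "text"  # skip non-text blocks
    ]
    return "\n".join(parts)


# Stand-in object shaped like the SDK's result, so the sketch runs offline.
# The payload is a placeholder, not real server output.
fake = SimpleNamespace(
    isError=False,
    content=[SimpleNamespace(type="text", text="placeholder payload")],
)
print(tool_result_text(fake))
```

In a live session you would pass the `result` returned by `session.call_tool(...)` instead of the stand-in object.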

What’s Available

The server exposes six read-only tools covering discovery, aggregation, and individual-trace browsing. The tool catalogue and per-tool descriptions are served by the server itself — your LLM client sees them automatically on connect, and they cannot drift from what the server actually implements. At a glance:
  • list_criteria — discover the evaluation criteria seen in your domain.
  • list_tag_keys / list_tag_values — discover the tag keys (e.g. agent, environment) and the distinct values used.
  • get_insights — bucketed aggregates (avg, count, stddev, min, max) per criterion over an arbitrary filter set.
  • get_grouped_insights — same, broken down by a tag value (by agent, by customer, etc.).
  • list_traces — page through individual traces with their full content and nested evaluations, optionally filtered by date, tag, criterion, or score.
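Because the catalogue is served by the server, a client can print it rather than rely on the list above. The following is a rough sketch using the MCP Python SDK's `list_tools()` call. `summarize_tools`, `print_catalogue`, and the `COMPOSO_API_KEY` environment variable are hypothetical names introduced for illustration.

```python
import asyncio
import os
from types import SimpleNamespace


def summarize_tools(tools) -> list[str]:
    """Format each tool as 'name: first line of its description'."""
    lines = []
    for tool in tools:
        description = (tool.description or "").strip()
        first_line = description.splitlines()[0] if description else "(no description)"
        lines.append(f"{tool.name}: {first_line}")
    return lines


async def print_catalogue(api_key: str) -> None:
    # Imported lazily so summarize_tools stays usable without the SDK installed.
    from mcp.client.session import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(
        "https://platform.composo.ai/mcp",
        headers={"API-Key": api_key},
    ) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listing = await session.list_tools()
            print("\n".join(summarize_tools(listing.tools)))


# Offline demo with a placeholder tool object (not a real Composo tool).
demo = [SimpleNamespace(name="example_tool", description="Placeholder.\nMore detail.")]
print(summarize_tools(demo))

# Only contacts the server when the (hypothetical) COMPOSO_API_KEY variable is set.
if os.environ.get("COMPOSO_API_KEY"):
    asyncio.run(print_catalogue(os.environ["COMPOSO_API_KEY"]))
```

Each tool object also carries an `inputSchema`, which is the authoritative source for argument names.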
Filters compose into a single structured filters object across the filter-taking tools: date range, score range, criteria list, and tag filters (a key plus a list of allowed values). list_traces is hard-capped at 10 rows per page — use the aggregate tools for population-level questions.
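To make the shape of a composed filter concrete, here is a hedged sketch that assembles the four filter kinds into one object. Every key name in it (`date_range`, `score_range`, `criteria`, `tags`, and their sub-keys) is hypothetical, as is the `build_filters` helper; the real field names come from the input schema the server advertises on connect.

```python
from datetime import date
from typing import Optional


def build_filters(
    start: Optional[date] = None,
    end: Optional[date] = None,
    min_score: Optional[float] = None,
    max_score: Optional[float] = None,
    criteria: Optional[list] = None,
    tags: Optional[dict] = None,
) -> dict:
    """Assemble one filters object. All key names here are hypothetical."""
    if start and end and start > end:
        raise ValueError("start must not be after end")
    if min_score is not None and max_score is not None and min_score > max_score:
        raise ValueError("min_score must not exceed max_score")
    filters: dict = {}
    if start or end:
        filters["date_range"] = {
            "from": start.isoformat() if start else None,
            "to": end.isoformat() if end else None,
        }
    if min_score is not None or max_score is not None:
        filters["score_range"] = {"min": min_score, "max": max_score}
    if criteria:
        filters["criteria"] = list(criteria)
    if tags:
        # Each tag filter pairs a key with its list of allowed values.
        filters["tags"] = [{"key": k, "values": list(v)} for k, v in tags.items()]
    return filters


# Placeholder values for illustration only.
example = build_filters(
    start=date(2025, 1, 1),
    end=date(2025, 1, 7),
    criteria=["helpfulness"],
    tags={"agent": ["support-bot"]},
)
print(example)
```

Building the object in one place means the same filter set can be reused across the aggregate tools and list_traces, assuming they accept it under a shared argument name.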

FAQ

Is the MCP server tenant-isolated?
Yes. Every tool reads the domain associated with your API key and scopes its query accordingly. A key for domain A cannot return data from domain B.

Can I write evaluations or annotations through it?
No. The MCP surface is read-only. Use the REST API or the Python SDK to submit evaluations, attach comments, or set ratings.

What about rate limits?
None are enforced today. We’ll add limits before broad rollout; contact us if you have a use case that needs them in advance.