Beta. The MCP server is a new surface; tool descriptions and behaviour may evolve as we learn how customers use it. If something isn’t working the way you’d expect, please tell us.

Introduction

Ask your AI assistant questions about your Composo evaluation data in plain English. Connect Claude Desktop, Claude Code, or any MCP-capable client to your account, and the model can pull criteria, tags, bucketed aggregates, and individual traces to answer them — no SQL, no dashboards, no per-question REST calls.

When to Use It

  • Ad-hoc analysis with an LLM: ask “what’s the average helpfulness score by agent over the last week?” or “did anything regress in the last month?” and let the model call the right tools, instead of clicking through dashboards or writing a query.
  • Trace debugging: surface low-scoring or filter-narrowed examples directly into an LLM session for inspection.
  • Embedding evaluation insight into your own app: an agent that already draws its tools from MCP can add these read-only tools to its catalogue.

Connect

Pick the client you’re using. The same API key works across all of them — generate one from the API Keys settings page. Either an API-Key header or Authorization: Bearer <key> is accepted.
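As a minimal sketch of the two accepted header forms, the helper below builds either one from a key. The function name composo_auth_headers and its use_bearer flag are illustrative assumptions, not part of any Composo SDK.

```python
def composo_auth_headers(api_key: str, use_bearer: bool = False) -> dict[str, str]:
    """Build the HTTP headers for one of the two accepted auth schemes.

    Hypothetical helper for illustration; only the header names come
    from the documentation above.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if use_bearer:
        # Standard bearer-token form.
        return {"Authorization": f"Bearer {api_key}"}
    # Dedicated API-Key header form.
    return {"API-Key": api_key}
```

Either dictionary can be passed wherever a client accepts custom request headers; send one form per request rather than both.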

Claude Desktop

Add an entry under mcpServers in your Claude Desktop config (Settings → Developer → Edit Config):
{
  "mcpServers": {
    "composo": {
      "transport": "http",
      "url": "https://platform.composo.ai/mcp",
      "headers": {
        "API-Key": "your-composo-api-key"
      }
    }
  }
}
Restart Claude Desktop. The Composo tools appear in the tool picker; ask Claude a question about your evaluations and it will call them.
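If you would rather script the change than edit the file by hand, something along these lines could merge the entry into an existing config without disturbing other servers. This is a sketch only: the function names are made up for illustration, and the config file path is left as a parameter because its location depends on your platform.

```python
import json
from pathlib import Path


def add_composo_server(config: dict, api_key: str) -> dict:
    """Return a copy of config with a "composo" entry under mcpServers."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved
    # and the caller's dictionary is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["composo"] = {
        "transport": "http",
        "url": "https://platform.composo.ai/mcp",
        "headers": {"API-Key": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at path, add the entry, and write it back.

    Assumes the file, if present, already contains valid JSON.
    """
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_composo_server(config, api_key), indent=2))
```

Call update_config_file with the path that Edit Config opens, then restart Claude Desktop as described above.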

mcp CLI

mcp connect https://platform.composo.ai/mcp \
  --header "API-Key: your-composo-api-key"

Claude Code

claude mcp add --transport http composo https://platform.composo.ai/mcp \
  --header "API-Key: your-composo-api-key"
New Claude Code sessions load the server and make its tools available in-session.

Generic MCP client

Any MCP client that supports the Streamable HTTP transport can connect — point it at https://platform.composo.ai/mcp and send API-Key (or Authorization: Bearer) on each request.
import asyncio

from mcp.client.session import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main():
    # Open a Streamable HTTP connection, authenticating with the API-Key header.
    async with streamablehttp_client(
        "https://platform.composo.ai/mcp",
        headers={"API-Key": "your-composo-api-key"},
    ) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake
            result = await session.call_tool("list_criteria", {})
            print(result.content)


asyncio.run(main())
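Building on the snippet above, the following sketch lists the tools the server advertises and prints the text returned by list_criteria. It uses the MCP Python SDK's list_tools and call_tool methods; the helper names text_blocks and show_tools_and_criteria are illustrative, and the assumption that results arrive as text content blocks has not been verified against the server.

```python
from typing import Any

URL = "https://platform.composo.ai/mcp"


def text_blocks(result: Any) -> list[str]:
    """Collect the text of every text content block in a tool result."""
    return [
        block.text
        for block in getattr(result, "content", [])
        if getattr(block, "type", None) == "text"
    ]


async def show_tools_and_criteria(api_key: str) -> None:
    # Imported here so text_blocks() stays usable without the SDK installed.
    from mcp.client.session import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(URL, headers={"API-Key": api_key}) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Ask the server which tools it exposes.
            listing = await session.list_tools()
            for tool in listing.tools:
                print(tool.name, "-", tool.description)
            # Call one discovery tool and print whatever text it returns.
            result = await session.call_tool("list_criteria", {})
            for text in text_blocks(result):
                print(text)
```

Run it with asyncio.run(show_tools_and_criteria("your-composo-api-key")).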

What’s Available

Once connected, your client sees six tools covering discovery, aggregation, and individual-trace browsing — no setup, just ask.
  • list_criteria — discover the evaluation criteria seen in your domain.
  • list_tag_keys / list_tag_values — discover the tag keys (e.g. agent, environment) and the distinct values used.
  • get_insights — bucketed aggregates (avg, count, stddev, min, max) per criterion over an arbitrary filter set.
  • get_grouped_insights — same, broken down by a tag value (by agent, by customer, etc.).
  • list_traces — page through individual traces with their full content and nested evaluations, optionally filtered by date, tag, criterion, or score.
All the read tools accept filters: narrow by date range, score range, criterion, or tag. For population-level questions reach for get_insights or get_grouped_insights; for individual examples use list_traces (paged 10 at a time).
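Because list_traces returns 10 traces per page, collecting a larger sample means looping over pages. The sketch below shows one way to structure that loop. How a page is actually requested (offset, cursor, or something else) is not specified here, so the call is hidden behind a hypothetical fetch_page callable, and treating a short page as the end of the data is an assumption.

```python
from typing import Callable, Iterator

PAGE_SIZE = 10  # list_traces is paged 10 at a time, per the text above


def iter_traces(
    fetch_page: Callable[[int], list[dict]], max_pages: int = 50
) -> Iterator[dict]:
    """Yield traces page by page until a short or empty page is returned."""
    for page in range(max_pages):
        batch = fetch_page(page)
        yield from batch
        if len(batch) < PAGE_SIZE:
            break  # assumed end-of-data signal


# Stand-in for a real list_traces call: 23 fake traces served in pages of 10.
_FAKE_TRACES = [{"id": i, "score": i / 23} for i in range(23)]


def fake_fetch(page: int) -> list[dict]:
    start = page * PAGE_SIZE
    return _FAKE_TRACES[start:start + PAGE_SIZE]


# Example: gather the low-scoring traces for inspection.
low = [t for t in iter_traces(fake_fetch) if t["score"] < 0.2]
```

In practice a score filter passed to list_traces itself would avoid fetching traces you do not need; the client-side filter here only keeps the example self-contained.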

FAQ

Can other Composo customers see my data?
No. Every query is scoped to the account your API key belongs to; no tool takes a domain or customer argument. A key for one account cannot read another account’s data.

Can I write evaluations or annotations through these tools?
No. These tools are read-only. Use the REST API or the Python SDK to submit evaluations, attach comments, or set ratings.