
Introduction

Composo’s tracing SDK enables you to capture and evaluate LLM calls from your agent applications in real time. It currently supports DIY agents built on OpenAI, with support for Anthropic, LangChain/LangGraph, and other SDKs coming soon.

Why Tracing Matters

Many agent frameworks abstract away the underlying LLM calls, making it difficult to understand what’s happening under the hood and to evaluate performance effectively. Many evaluation platforms only let you send traces to a remote system and wait to view results later. Composo gives you the best of both worlds: trace and evaluate immediately, or view your traces in the Composo platform, your own observability tooling, spreadsheets, or CI/CD. By instrumenting your LLM calls and marking agent boundaries, you can evaluate performance in real time and take action right away, adjusting behavior before it reaches your users.

Key Features

  • Mark Agent Boundaries: Use AgentTracer context manager or @agent_tracer decorator to define which LLM calls belong to which agent
  • Hierarchical Tracing: Support for nested agents to model complex multi-agent architectures
  • Independent Evaluation: Each agent’s performance is evaluated separately with average, min, max and standard-deviation statistics reported per agent
  • Flexible Evaluation: Get evaluation results instantly in your code, or view traces in the Composo platform for deeper analysis (or through seamless sync with any observability platform like Grafana, Sentry, Langfuse, LangSmith, Braintrust)
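To make the per-agent statistics concrete, here is a minimal sketch of what "evaluated separately with average, min, max and standard-deviation statistics" means. The scores below are hypothetical illustration data, not output from Composo:

```python
import statistics

# Hypothetical per-call scores for two agents in one trace;
# real scores come from Composo's evaluation, not from this sketch.
scores_by_agent = {
    "orchestrator": [0.92, 0.88, 0.95],
    "agent2": [0.71, 0.83],
}

# Summarize each agent independently, mirroring the per-agent
# average/min/max/std statistics described above.
for agent, scores in scores_by_agent.items():
    stats = {
        "avg": statistics.mean(scores),
        "min": min(scores),
        "max": max(scores),
        "std": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }
    print(agent, stats)
```

Because each agent is summarized from its own calls only, a weak sub-agent cannot hide behind a strong orchestrator in an aggregate score.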

Framework Support

  • Currently Supported: Agents built on OpenAI LLMs
  • Coming Soon: Anthropic, LangChain, OpenAI Agents, and other popular frameworks

Quickstart

This guide walks you through adding tracing to your agent application in 3 steps. We’ll start with a simple multi-agent application and add tracing incrementally.

Starting Code

Here’s a simple multi-agent application we want to trace:
from openai import OpenAI

open_ai_client = OpenAI()

def agent_2():
    return open_ai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "B"}],
        max_tokens=5,
    )

# Orchestrator agent
response1 = open_ai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "A"}],
    max_tokens=5,
)

response2 = agent_2()

Step 1: Install and Initialize

Install the Composo SDK and initialize tracing for OpenAI.
pip install composo
Add these imports and initialization:
# Add these imports at the top
from composo.tracing import ComposoTracer, Instruments, AgentTracer, agent_tracer
from composo.models import criteria
from composo import Composo

# Initialize tracing and Composo client (add after imports)
ComposoTracer.init(instruments=[Instruments.OPENAI])
composo_client = Composo(
    api_key="your_composo_key"
)

Step 2: Mark Your Agent Boundaries

Wrap your agent logic with AgentTracer or @agent_tracer to mark boundaries. For the function-based agent, add the decorator:
# Add decorator to agent_2
@agent_tracer(name="agent2")
def agent_2():
    return open_ai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "B"}],
        max_tokens=5,
    )
For the orchestrator, wrap with AgentTracer context manager:
# Wrap orchestrator logic
with AgentTracer("orchestrator") as tracer:
    with AgentTracer("agent1"):
        response1 = open_ai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "A"}],
            max_tokens=5,
        )
    response2 = agent_2()
Note: the tracer object returned by the root AgentTracer is needed for evaluation in Step 3.

Step 3: Evaluate Your Trace

Add evaluation after your agents complete:
# Evaluate the trace (add after agent execution)
for result, criterion in zip(
    composo_client.evaluate_trace(tracer.trace, criteria=criteria.agent),
    criteria.agent
):
    print("Criterion:", criterion)
    print(f"Evaluation Result: {result}\n")
Here, we run the Composo agent evaluation framework with criteria.agent, but you can pass any criteria, as shown in the Agent evaluation section of our docs. As long as each criterion starts with ‘Reward agents’, it will work.
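For example, you might define custom criteria as plain strings. The wording below is purely illustrative; write whatever fits your application, as long as each agent-level criterion begins with "Reward agents":

```python
# Illustrative custom criteria; the exact wording is up to you, but each
# agent-level criterion must begin with "Reward agents".
custom_criteria = [
    "Reward agents that resolve the user's request without unnecessary LLM calls",
    "Reward agents that hand off to sub-agents only when the task requires it",
]

# These can then be passed in place of criteria.agent, e.g.:
# composo_client.evaluate_trace(tracer.trace, criteria=custom_criteria)
```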

Complete Example

from composo.tracing import ComposoTracer, Instruments, AgentTracer, agent_tracer
from composo.models import criteria
from composo import Composo
from openai import OpenAI

# Instrument OpenAI
ComposoTracer.init(instruments=[Instruments.OPENAI])
composo_client = Composo(
    api_key="your_composo_key"
)
open_ai_client = OpenAI()

# agent_tracer decorator marks any LLM calls inside as belonging to agent2
@agent_tracer(name="agent2")
def agent_2():
    return open_ai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "B"}],
        max_tokens=5,
    )

# AgentTracer context manager marks any LLM calls inside as belonging to orchestrator
# Has the added benefit of returning a tracer object that can be used for evaluation!
with AgentTracer("orchestrator") as tracer:
    with AgentTracer("agent1"):
        response1 = open_ai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "A"}],
            max_tokens=5,
        )
    response2 = agent_2()

for result, criterion in zip(
    composo_client.evaluate_trace(tracer.trace, criteria=criteria.agent),
    criteria.agent
):
    print("Criterion:", criterion)
    print(f"Evaluation Result: {result}\n")

API Reference

ComposoTracer.init

Initializes the Composo tracing system and instruments specified LLM libraries to automatically capture their API calls.

Parameters

  • instruments (list[Instruments]): List of LLM libraries to instrument for tracing. Currently supported:
    • Instruments.OPENAI - Instruments OpenAI client to trace all chat completion calls

Usage

from composo.tracing import ComposoTracer, Instruments

# Initialize tracing with OpenAI instrumentation
ComposoTracer.init(instruments=[Instruments.OPENAI])
Call this once at the start of your application before making any LLM calls.

@agent_tracer

Decorator that marks all LLM calls within a function as belonging to a specific agent. Use this for functional agent implementations.

Parameters

  • name (str): The name of the agent for tracing purposes

Usage

from composo.tracing import agent_tracer

@agent_tracer(name="agent2")
def my_agent():
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
    )
    return response
All LLM calls within the decorated function are automatically associated with the specified agent name.

AgentTracer

Context manager that marks all LLM calls within its scope as belonging to a specific agent. Returns a tracer object that can be used for evaluation.

Parameters

  • name (str): The name of the agent for tracing purposes

Returns

  • tracer: Tracer object containing the trace attribute with captured LLM calls

Usage

from composo.tracing import AgentTracer

# Use as context manager
with AgentTracer("orchestrator") as tracer:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
    )

    # Access the trace for evaluation
    trace_data = tracer.trace

Nested Agents

AgentTracer supports nesting to model hierarchical agent architectures:
with AgentTracer("orchestrator") as tracer:
    # Calls here belong to orchestrator

    with AgentTracer("agent1"):
        # Calls here belong to agent1
        pass

    with AgentTracer("agent2"):
        # Calls here belong to agent2
        pass

evaluate_trace

Evaluates captured LLM traces against specified criteria. Composo evaluates each agent independently and reports statistics (average, min, max, std) for scores within each agent.

Parameters

  • trace: Trace object returned by AgentTracer context manager (accessed via tracer.trace)
  • criteria: Evaluation criteria to apply to the trace (e.g., criteria.agent)

Usage

from composo import Composo
from composo.models import criteria
from composo.tracing import AgentTracer

# Capture trace
with AgentTracer("my_agent") as tracer:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
    )

# Evaluate the trace
composo_client = Composo(api_key="your_api_key")
composo_client.evaluate_trace(tracer.trace, criteria=criteria.agent)

Next Steps
