- Dual Client Support: Both synchronous and asynchronous clients
- Convenient Format: Compatible with Python dictionaries and result objects from OpenAI and Anthropic
- HTTP Goodies: Connection pooling + retry logic
Note: This SDK is for Python users. If you’re using TypeScript, JavaScript, or other languages, please refer to the REST API Reference to call the API directly.
Installation
Install the SDK using pip:
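(Assuming the package is published on PyPI under the name `composo`:)

```bash
pip install composo
```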
Quick Start
Let's run a simple Hello World evaluation to get started with Composo.
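A minimal sketch, assuming the import path `composo` and an API key in the `COMPOSO_API_KEY` environment variable:

```python
from composo import Composo  # import path assumed

client = Composo()  # reads COMPOSO_API_KEY from the environment

result = client.evaluate(
    messages=[
        {"role": "user", "content": "Say hello to the world."},
        {"role": "assistant", "content": "Hello, world!"},
    ],
    criteria="Reward responses that greet the world politely.",
)
print(result)
```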
Reference
Client Parameters
Both Composo and AsyncComposo clients accept the following parameters during instantiation (see the sketch below the table):
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| api_key | str | No* | None | Your Composo API key. If not provided, the COMPOSO_API_KEY environment variable is used |
| model_core | str | No | Latest Align model | The model to use for evaluation. Options: align-20250529, align-lightning-20250731 |
| num_retries | int | No | 1 | Number of retry attempts for failed requests |
*Required if the COMPOSO_API_KEY environment variable is not set.
The Lightning model does not currently support agents and tool calling; for those evaluations you must use the default Align model.
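A sketch of explicit instantiation with these parameters (the import path is an assumption):

```python
from composo import Composo, AsyncComposo  # import path assumed

client = Composo(
    api_key="your-api-key",       # optional if COMPOSO_API_KEY is set
    model_core="align-20250529",  # the default Align model
    num_retries=3,                # overrides the default of 1
)

async_client = AsyncComposo(api_key="your-api-key")
```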
Evaluation Method Parameters
The evaluate() method accepts the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | List[Dict] | Yes | List of message dictionaries with 'role' and 'content' keys |
| criteria | str or List[str] | Yes | Evaluation criteria (single string or list of criteria) |
| tools | List[Dict] | No | Tool definitions for evaluating tool calls |
| result | OpenAI/Anthropic result object | No | Pre-computed LLM result object to evaluate |
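For instance, a tool-call evaluation might look like the following sketch; the tool and tool-call schemas shown here follow the OpenAI function-tool format, which is an assumption rather than a documented requirement:

```python
from composo import Composo  # import path assumed

client = Composo()

# Tool definition in the OpenAI function-tool format (assumed)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

result = client.evaluate(
    messages=[
        {"role": "user", "content": "What's the weather in Paris?"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": '{"city": "Paris"}',
                    },
                }
            ],
        },
    ],
    tools=tools,
    criteria="Reward tool calls that pass the city exactly as the user stated it.",
)
```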
Environment Variables
The SDK supports the following environment variables:
- COMPOSO_API_KEY: Your Composo API key (used when the api_key parameter is not provided)
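A quick sketch of the fallback behaviour; setting the variable from Python here stands in for an `export` in your shell:

```python
import os

from composo import Composo  # import path assumed

# Equivalent to `export COMPOSO_API_KEY=...` in your shell
os.environ["COMPOSO_API_KEY"] = "your-api-key"

client = Composo()  # no api_key argument needed; the env var is picked up
```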
Response Format
The evaluate method returns an EvaluationResponse object:
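A sketch of reading the response; the score and explanation attribute names are assumptions for illustration, not confirmed field names:

```python
from composo import Composo  # import path assumed

client = Composo()

response = client.evaluate(
    messages=[
        {"role": "user", "content": "Say hello to the world."},
        {"role": "assistant", "content": "Hello, world!"},
    ],
    criteria="Reward responses that greet the world politely.",
)

# Hypothetical attribute names — check EvaluationResponse for the real fields
print(response.score)        # numeric evaluation score
print(response.explanation)  # rationale behind the score
```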
Async Evaluation
Use the async client when you need to run multiple evaluations concurrently or integrate with async workflows.
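A minimal async sketch, again assuming the `composo` import path:

```python
import asyncio

from composo import AsyncComposo  # import path assumed

async def main() -> None:
    client = AsyncComposo()  # reads COMPOSO_API_KEY from the environment
    result = await client.evaluate(
        messages=[
            {"role": "user", "content": "Summarise the plot of Hamlet."},
            {"role": "assistant", "content": "A Danish prince avenges his murdered father."},
        ],
        criteria="Reward summaries that are accurate and concise.",
    )
    print(result)

asyncio.run(main())
```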
Multiple Criteria Evaluation
When evaluating against multiple criteria, the async client runs all evaluations concurrently for better performance.
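A sketch passing a list of criteria; the exact shape of the return value for multiple criteria isn't documented here, so this example simply prints it:

```python
import asyncio

from composo import AsyncComposo  # import path assumed

async def main() -> None:
    client = AsyncComposo()
    results = await client.evaluate(
        messages=[
            {"role": "user", "content": "Explain photosynthesis."},
            {"role": "assistant", "content": "Plants turn light into chemical energy."},
        ],
        criteria=[
            "Reward responses that are factually accurate.",
            "Reward responses that a non-expert can follow.",
        ],
    )
    print(results)

asyncio.run(main())
```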
Evaluating OpenAI/Anthropic Outputs
You can directly evaluate the result of a call to the OpenAI SDK by passing the return value of completions.create to evaluate() via the result parameter. N.B. Composo will always evaluate choices[0].
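A sketch with the OpenAI SDK (the Composo import path and the model name chosen here are illustrative):

```python
from composo import Composo  # import path assumed
from openai import OpenAI

openai_client = OpenAI()
composo_client = Composo()

messages = [{"role": "user", "content": "Write a haiku about the sea."}]
completion = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
)

# Composo evaluates completion.choices[0]
evaluation = composo_client.evaluate(
    messages=messages,
    result=completion,
    criteria="Reward haikus that follow the 5-7-5 syllable structure.",
)
```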
Error Handling
The SDK provides specific exception types:
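The concrete exception classes aren't listed on this page, so the names below are placeholders for illustration only; substitute the SDK's real exception types:

```python
from composo import Composo  # import path assumed
# Hypothetical names — the SDK's actual exception classes may differ
from composo import ComposoAPIError, ComposoRateLimitError

client = Composo()

try:
    result = client.evaluate(
        messages=[
            {"role": "user", "content": "Hi"},
            {"role": "assistant", "content": "Hello!"},
        ],
        criteria="Reward friendly greetings.",
    )
except ComposoRateLimitError:
    pass  # back off and retry later
except ComposoAPIError as exc:
    print(f"Evaluation failed: {exc}")
```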
Logging
The SDK uses Python's standard logging module. Configure the logging level:
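A minimal sketch, assuming the SDK logs under the "composo" logger name:

```python
import logging

# Show the SDK's debug output; the "composo" logger name is an assumption
logging.basicConfig(level=logging.INFO)
logging.getLogger("composo").setLevel(logging.DEBUG)
```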